Commit Graph

174 Commits

Author SHA1 Message Date
Han Gao
62225503f5 configs: sync use ETNAVIV instead of Galcore
Signed-off-by: Han Gao <gaohan@iscas.ac.cn>
2023-12-08 11:29:59 +08:00
Han Gao
dddd938847 config: ahead: enable bluetooth
Signed-off-by: Han Gao <gaohan@iscas.ac.cn>
2023-12-08 11:28:26 +08:00
Robert Nelson
49c39d9e97 config: enable SERIAL_DEV_BUS/BT_HCIUART_BCM/BT_BCM
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
2023-12-08 11:28:26 +08:00
Robert Nelson
918e660a58 cleanup: remove random BT_HCIUART_RTL3WIRE driver, sync back with v5.10.113
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
2023-12-08 11:28:26 +08:00
Icenowy Zheng
b8c5d35460 revyos_defconfig: use ETNAVIV instead of Galcore
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
2023-12-08 11:24:07 +08:00
Icenowy Zheng
afcdc418d4 drm/etnaviv: hack: use only pta id 0
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
2023-12-08 11:24:07 +08:00
Icenowy Zheng
d6f09caa32 light: use etnaviv
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
2023-12-08 11:24:07 +08:00
Icenowy Zheng
727e6f3be6 galcore: adapt to vivante,gc
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
2023-12-08 11:24:07 +08:00
Icenowy Zheng
2f78e6b748 drm/etnaviv: add GC620
Dirty.

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
2023-12-08 11:24:07 +08:00
Icenowy Zheng
cfe4413691 drm/etnaviv: add hwdb entry for TH1520 GC620
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
2023-12-08 11:24:07 +08:00
Icenowy Zheng
373e8161c5 drm/etnaviv: add workaround for GC620 on TH1520 (0x5552)
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
2023-12-08 11:24:07 +08:00
Icenowy Zheng
9ba56a64a8 drm/etnaviv: add handle for GPUs with only SECURITY_AHB flag
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
2023-12-08 11:24:07 +08:00
Lucas Stach
ad542ee013 drm/etnaviv: expedited MMU fault handling
The GPU is halted when it hits a MMU exception, so there is no point in
waiting for the job timeout to expire or try to work out if the GPU is
still making progress in the timeout handler, as we know that the GPU
won't make any more progress.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
2023-12-08 11:24:07 +08:00
Lucas Stach
e44f708a9e drm/etnaviv: drop GPU initialized property
Now that it is only used to track the driver internal state of
the MMU global and cmdbuf objects, we can get rid of this property
by making the free/finit functions of those objects safe to call
on an uninitialized object.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
2023-12-08 11:24:07 +08:00
Lucas Stach
ccbbd8ae8f drm/etnaviv: better track GPU state
Instead of only tracking if the FE is running, use a enum to better
describe the various states the GPU can be in. This allows some
additional validation to make sure that functions that expect a
certain GPU state are only called when the GPU is actually in that
state.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
2023-12-08 11:24:07 +08:00
Lucas Stach
e286235a75 drm/etnaviv: avoid runtime PM usage in etnaviv_gpu_bind
Nothing in this callpath actually touches the GPU, so there is no reason
to get it out of suspend state here. Only if runtime PM isn't enabled at
all we must make sure to enable the clocks, so the GPU init routine can
access the GPU later on.

This also removes the need to guard against the state where the driver
isn't fully initialized yet in the runtime PM resume handler.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
2023-12-08 11:24:07 +08:00
Lucas Stach
e084b32587 drm/etnaviv: slow down FE idle polling
Currently the FE is spinning way too fast when polling for new work in
the FE idleloop. As each poll fetches 16 bytes from memory, a GPU running
at 1GHz with the current setting of 200 wait cycle between fetches causes
80 MB/s of memory traffic just to check for new work when the GPU is
otherwise idle, which is more FE traffic than in some GPU loaded cases.

Significantly increase the number of wait cycles to slow down the poll
interval to ~30µs, limiting the FE idle memory traffic to 512 KB/s, while
providing a max latency which should not hurt most use-cases. The FE WAIT
command seems to have some unknown discrete steps in the wait cycles so
we may over/undershoot the target a bit, but that should be harmless.

If the GPU core base frequency is unknown keep the 200 wait cycles as
a sane default.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Sui Jingfeng <suijingfeng@loongson.cn>
Tested-by: Sui Jingfeng <suijingfeng@loongson.cn>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
2023-12-08 11:24:07 +08:00
Lucas Stach
880ab177a1 drm/etnaviv: split fence lock
The fence lock currently protects two distinct things. It protects the fence
IDR from concurrent inserts and removes and also keeps drm_sched_job_arm and
drm_sched_entity_push_job in one atomic section to guarantee the fence seqno
monotonicity. Split the lock into those two functions.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2023-12-08 11:24:07 +08:00
Christian Gmeiner
c32beb855f drm/etnaviv: print MMU exception cause
The MMU tells us the fault status. While the raw register value is
already printed, it's a bit more user friendly to translate the
fault reasons into human readable format.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2023-12-08 11:24:07 +08:00
Lucas Stach
8304fb3b0e drm/etnaviv: switch to PFN mappings
There is no reason to use page based mappings, as the established
mappings are special driver mappings anyways and should not be
handled like normal pages.

Be consistent with what other drivers do and use raw PFN based
mappings.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2023-12-08 11:24:07 +08:00
Lucas Stach
51464f6592 drm/etnaviv: reap idle mapping if it doesn't match the softpin address
When a idle BO, which is held open by another process, gets freed by
userspace and subsequently referenced again by e.g. importing it again,
userspace may assign a different softpin VA than the last time around.
As the kernel GEM object still exists, we likely have a idle mapping
with the old VA still cached, if it hasn't been reaped in the meantime.

As the context matches, we then simply try to resurrect this mapping by
increasing the refcount. As the VA in this mapping does not match the
new softpin address, we consequently fail the otherwise valid submit.
Instead of failing, reap the idle mapping.

Cc: stable@vger.kernel.org # 5.19
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
2023-12-08 11:24:07 +08:00
Lucas Stach
f87b29df53 drm/etnaviv: move idle mapping reaping into separate function
The same logic is already used in two different places and now
it will also be needed outside of the compilation unit, so split
it into a separate function.

Cc: stable@vger.kernel.org # 5.19
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
2023-12-08 11:24:07 +08:00
Christian Gmeiner
c71d194901 drm/etnaviv: print offender task information on hangcheck recovery
Track the pid per submit, so we can print the name and cmdline of
the task which submitted the batch that caused the gpu to hang.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2023-12-08 11:24:07 +08:00
Lucas Stach
dc96c0fe68 drm/etnaviv: reap idle softpin mappings when necessary
Right now the only point where softpin mappings get removed from the
MMU context is when the mapped GEM object is destroyed. However,
userspace might want to reuse that address space before the object
is destroyed, which is a valid usage, as long as all mapping in that
region of the address space are no longer used by any GPU jobs.

Implement reaping of idle MMU mappings that would otherwise
prevent the insertion of a softpin mapping.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Acked-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2023-12-08 11:24:07 +08:00
Lucas Stach
5ffd7f5029 drm/etnaviv: move flush_seq increment into etnaviv_iommu_map/unmap
The flush sequence is a marker that the page tables have been changed
and any affected TLBs need to be flushed. Move the flush_seq increment
a little further down the call stack to place it next to the actual
page table manipulation. Not functional change.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Acked-by: Guido Günther <agx@sigxcpu.org>
2023-12-08 11:24:07 +08:00
Lucas Stach
0f89c7db16 drm/etnaviv: move MMU context ref/unref into map/unmap_gem
This makes it a little more clear that the mapping holds a reference
to the context once the buffer has been successfully mapped into that
context and simplifies the error handling a bit.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Acked-by: Guido Günther <agx@sigxcpu.org>
2023-12-08 11:24:07 +08:00
Michael Walle
4a4a708db1 drm/etnaviv: use a 32 bit mask as coherent DMA mask
The STLB and the first command buffer (which is used to set up the TLBs)
has a 32 bit size restriction in hardware. There seems to be no way to
specify addresses larger than 32 bit. Keep it simple and restict the
addresses to the lower 4 GiB range for all coherent DMA memory
allocations.

Please note, that platform_device_alloc() will initialize dev->dma_mask
to point to pdev->platform_dma_mask, thus dma_set_mask() will work as
expected.

While at it, move the dma_mask setup code to the of_dma_configure() to
keep all the DMA setup code next to each other.

Suggested-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2023-12-08 11:24:07 +08:00
Christian Gmeiner
255c988024 drm/etnaviv: provide more ID values via GET_PARAM ioctl.
Make it possible for the user space to access these ID values.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2023-12-08 11:24:07 +08:00
Guido Günther
af2602ac87 drm/etnaviv: Add lockdep annotations for context lock
etnaviv_iommu_find_iova has it so etnaviv_iommu_insert_exact and
lockdep_assert_held should have it as well.

Signed-off-by: Guido Günther <agx@sigxcpu.org>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2023-12-08 11:24:07 +08:00
Lucas Stach
b6d74c3560 drm/etnaviv: fix dumping of active MMU context
[ Upstream commit 20faf2005ec85fa1a6acc9a74ff27de667f90576 ]

gpu->mmu_context is the MMU context of the last job in the HW queue, which
isn't necessarily the same as the context from the bad job. Dump the MMU
context from the scheduler determined bad submit to make it work as intended.

Fixes: 17e4660ae3d7 ("drm/etnaviv: implement per-process address spaces on MMUv2")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-08 11:24:07 +08:00
Lucas Stach
8f3347a598 drm/etnaviv: fix reference leak when mmaping imported buffer
commit 963b2e8c428f79489ceeb058e8314554ec9cbe6f upstream.

drm_gem_prime_mmap() takes a reference on the GEM object, but before that
drm_gem_mmap_obj() already takes a reference, which will be leaked as only
one reference is dropped when the mapping is closed. Drop the extra
reference when dma_buf_mmap() succeeds.

Cc: stable@vger.kernel.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-08 11:24:07 +08:00
Lucas Stach
9aa4002cc9 drm/etnaviv: don't truncate physical page address
[ Upstream commit d37c120b73128690434cc093952439eef9d56af1 ]

While the interface for the MMU mapping takes phys_addr_t to hold a
full 64bit address when necessary and MMUv2 is able to map physical
addresses with up to 40bit, etnaviv_iommu_map() truncates the address
to 32bits. Fix this by using the correct type.

Fixes: 931e97f3afd8 ("drm/etnaviv: mmuv2: support 40 bit phys address")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-08 11:24:07 +08:00
Doug Brown
e025348004 drm/etnaviv: add missing quirks for GC300
[ Upstream commit cc7d3fb446a91f24978a6aa59cbb578f92e22242 ]

The GC300's features register doesn't specify that a 2D pipe is
available, and like the GC600, its idle register reports zero bits where
modules aren't present.

Signed-off-by: Doug Brown <doug@schmorgal.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-08 11:24:07 +08:00
Lucas Stach
58f804da14 drm/etnaviv: check for reaped mapping in etnaviv_iommu_unmap_gem
commit e168c25526cd0368af098095c2ded4a008007e1b upstream.

When the mapping is already reaped the unmap must be a no-op, as we
would otherwise try to remove the mapping twice, corrupting the involved
data structures.

Cc: stable@vger.kernel.org # 5.4
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Acked-by: Guido Günther <agx@sigxcpu.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-08 11:24:07 +08:00
Thomas Zimmermann
277f5bc2bc drm/etnaviv: Introduce GEM object functions
GEM object functions deprecate several similar callback interfaces in
struct drm_driver. This patch replaces the per-driver callbacks with
per-instance callbacks in etnaviv. The only exception is gem_prime_mmap,
which is non-trivial to convert.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200923102159.24084-4-tzimmermann@suse.de
2023-12-08 11:24:07 +08:00
Lucas Stach
2dc08a7dac drm/etnaviv: rework linear window offset calculation
The current calculation based on the required_dma mask can be significantly
off, so that the linear window only overlaps a small part of the DRAM
address space. This can lead to the command buffer being unmappable, which
is obviously bad.

Rework the linear window offset calculation to be based on the command buffer
physical address, making sure that the command buffer is always mappable.

Tested-by: Primoz Fiser <primoz.fiser@norik.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2023-12-08 11:24:07 +08:00
Icenowy Zheng
05735e9ff4 drm/verisilicon: fix cursor position
The cursor should be placed at (x + hot_x, y + hot_y) to allow partial
display of a cursor.

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
2023-12-05 23:38:42 +08:00
Han Gao
d81a2398a8 nf: enable nf mangle
Signed-off-by: Han Gao <gaohan@iscas.ac.cn>
2023-12-03 15:33:55 +08:00
Han Gao
e133903e4d meles: add 4g/8g dts
Signed-off-by: Han Gao <gaohan@iscas.ac.cn>
2023-12-03 14:53:53 +08:00
Han Gao
8b4bca072c meles: fix: usb2.0
Signed-off-by: Han Gao <gaohan@iscas.ac.cn>
2023-12-03 14:53:53 +08:00
Haaland Chen
026ae08e53 riscv: dts: light: add Milk-V Meles board
Signed-off-by: Haaland Chen <haaland@milkv.io>
2023-12-03 14:53:53 +08:00
Han Gao
95a545985b riscv: default enable xtheadc
Signed-off-by: Han Gao <gaohan@iscas.ac.cn>
2023-11-29 23:46:21 +08:00
Han Gao
f722795a38 toolchains: fix mainline toolchain build
Signed-off-by: Han Gao <gaohan@iscas.ac.cn>
2023-11-29 23:46:21 +08:00
Han Gao
06bd593cfd ci: kernel auto build on thead-gcc & mainline-gcc
thead-gcc: v2.8.0
mainline-gcc: v13.2

Signed-off-by: Han Gao <gaohan@iscas.ac.cn>
2023-11-29 23:46:21 +08:00
Jisheng Zhang
556f057aca riscv: cmpxchg: implement arch_cmpxchg64_{relaxed|acquire|release}
After selecting ARCH_USE_CMPXCHG_LOCKREF, one straight futher
optimization is implementing the arch_cmpxchg64_relaxed() because the
lockref code does not need the cmpxchg to have barrier semantics. At
the same time, implement arch_cmpxchg64_acquire and
arch_cmpxchg64_release as well.

However, on both TH1520 and JH7110 platforms, I didn't see obvious
performance improvement with Linus' test case [1]. IMHO, this may
be related with the fence and lr.d/sc.d hw implementations. In theory,
lr/sc without fence could give performance improvement over lr/sc plus
fence, so add the code here to leave performance improvement room on
newer HW platforms.

Link: http://marc.info/?l=linux-fsdevel&m=137782380714721&w=4 [1]
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
2023-11-29 23:46:21 +08:00
Jisheng Zhang
186355454e riscv: select ARCH_USE_CMPXCHG_LOCKREF
Select ARCH_USE_CMPXCHG_LOCKREF to enable the cmpxchg-based lockless lockref
implementation for riscv.

Using Linus' test case[1] on TH1520 platform, I see a 11.2% improvement.
On JH7110 platform, I see 12.0% improvement.

Link: http://marc.info/?l=linux-fsdevel&m=137782380714721&w=4 [1]
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
2023-11-29 23:46:21 +08:00
Jisheng Zhang
b25b9b6b16 riscv: select ARCH_HAS_FAST_MULTIPLIER
Currently, riscv linux requires at least IMA, so all platforms have a
multiplier. And I assume the 'mul' efficiency is comparable or better
than a sequence of five or so register-dependent arithmetic
instructions. Select ARCH_HAS_FAST_MULTIPLIER to get slightly nicer
codegen. Refer to commit f9b4192923fa ("[PATCH] bitops: hweight()
speedup") for more details.

In a simple benchmark test calling hweight64() in a loop, it got:
about 14% preformance improvement on JH7110, tested on Milkv Mars.

about 23% performance improvement on TH1520 and SG2042, tested on
Sipeed LPI4A and SG2042 platform.

a slight performance drop on CV1800B, tested on milkv duo. Among all
riscv platforms in my hands, this is the only one which sees a slight
performance drop. It means the 'mul' isn't quick enough. However, the
situation exists on x86 too, for example, P4 doesn't have fast
integer multiplies as said in the above commit, x86 also selects
ARCH_HAS_FAST_MULTIPLIER. So let's select ARCH_HAS_FAST_MULTIPLIER
which can benefit almost riscv platforms.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Tested-by: Samuel Holland <samuel.holland@sifive.com>
2023-11-29 23:46:21 +08:00
Icenowy Zheng
35a32afaf8 Kernel: fix out-of-tree build for merged kernel modules
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
2023-11-29 12:54:48 +08:00
NekoRouter
3e042d29a3 Re-disable pwm,qspi0,qspi1 on beagle board
Keep light-a and beagle both can work
2023-10-22 21:04:53 -05:00
NekoRouter
c32ad7b836 Revert "sync: device-tree changes from main repo"
Enable pwm, qspi0, qspi1 on all devices

This reverts partial of commit 40ef3b0976
2023-10-22 21:04:53 -05:00