path: root/arch/powerpc/mm/hash_native_64.c
Age    Commit message    Author    Files    Lines
2019-05-03powerpc/mm: Move book3s64 specifics in subdirectory mm/book3s64Christophe Leroy1-884/+0
Many files in arch/powerpc/mm are only for book3S64. This patch creates a subdirectory for them. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Update the selftest sym links, shorten new filenames, cleanup some whitespace and formatting in the new files.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-04powerpc/64s/hash: Do not use PPC_INVALIDATE_ERAT on CPUs before POWER9Nicholas Piggin1-2/+2
PPC_INVALIDATE_ERAT is slbia IH=7 which is a new variant introduced with POWER9, and the result is undefined on earlier CPUs. Commits 7b9f71f974 ("powerpc/64s: POWER9 machine check handler") and d4748276ae ("powerpc/64s: Improve local TLB flush for boot and MCE on POWER9") caused POWER7/8 code to use this instruction. Remove it. An ERAT flush can be made by invalidating the SLB, but before POWER9 that requires a flush and rebolt. Fixes: 7b9f71f974 ("powerpc/64s: POWER9 machine check handler") Fixes: d4748276ae ("powerpc/64s: Improve local TLB flush for boot and MCE on POWER9") Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-30powerpc: remove unnecessary inclusion of asm/tlbflush.hChristophe Leroy1-1/+0
asm/tlbflush.h is only needed for:
- using functions xxx_flush_tlb_xxx()
- using MMU_NO_CONTEXT
- including asm-generic/pgtable.h
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-30powerpc: clean inclusions of asm/feature-fixups.hChristophe Leroy1-0/+1
Files not using feature fixups don't need asm/feature-fixups.h; files using feature fixups need asm/feature-fixups.h. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24powerpc/mm/hash: Reduce contention on hpte lockAneesh Kumar K.V1-16/+33
We already do this in part. This patch makes sure we always try to search for the hpte without holding the lock, and redo the compare with the lock held once a match is found. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-24powerpc/mm/hash: Add hpte_get_old_v and use that instead of opencodingAneesh Kumar K.V1-20/+7
No functional change Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-28Merge branch 'fixes' into nextMichael Ellerman1-1/+15
Merge our fixes branch from the 4.16 cycle. There were a number of important fixes merged, in particular some Power9 workarounds that we want in next for testing purposes. There's also been some conflicting changes in the CPU features code which are best merged and tested before going upstream.
2018-03-23powerpc/mm: Fixup tlbie vs store ordering issue on POWER9Aneesh Kumar K.V1-1/+15
On POWER9, under some circumstances, a broadcast TLB invalidation might complete before all previous stores have drained, potentially allowing stale stores to become visible after the invalidation. This works around it by doubling up those TLB invalidations, which was verified by HW to be sufficient to close the risk window. This will be documented in a yet-to-be-published errata. Fixes: 1a472c9dba6b ("powerpc/mm/radix: Add tlbflush routines") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Enable the feature in the DT CPU features code for all Power9, rename the feature to CPU_FTR_P9_TLBIE_BUG per benh.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-13powerpc/mm: Drop the function native_register_proc_table()Anshuman Khandual1-15/+0
This is left over from the segment table implementation and is not getting called from anywhere now. Hence just drop it. Suggested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-18powerpc/64s: Improve local TLB flush for boot and MCE on POWER9Nicholas Piggin1-0/+97
There are several cases outside the normal address space management where a CPU's entire local TLB is to be flushed:

1. Booting the kernel, in case something has left stale entries in the TLB (e.g., kexec).

2. Machine check, to clean corrupted TLB entries.

One other place where the TLB is flushed is waking from deep idle states. The flush is a side-effect of calling ->cpu_restore with the intention of re-setting various SPRs. The flush itself is unnecessary because in the first case, the TLB should not acquire new corrupted TLB entries as part of sleep/wake (though they may be lost).

This type of TLB flush is coded inflexibly, several times for each CPU type, and it has a number of problems with ISA v3.0B:

- The current radix mode of the MMU is not taken into account; it is always done as a hash flush. For IS=2 (LPID-matching flush from host) and IS=3 with HV=0 (guest kernel flush), tlbie(l) is undefined if the R field does not match the current radix mode.

- ISA v3.0B hash must flush the partition and process table caches as well.

- ISA v3.0B radix must flush partition and process scoped translations, partition and process table caches, and also the page walk cache.

So consolidate the flushing code and implement it in C and inline asm under the mm/ directory with the rest of the flush code. Add ISA v3.0B cases for radix and hash, and use the radix flush in the radix environment. Provide a way for IS=2 (LPID flush) to specify the radix mode of the partition. Have KVM pass in the radix mode of the guest.

Take out the flushes from early cputable/dt_cpu_ftrs detection hooks, and move it later in the boot process, after the MMU registers are set up and before relocation is first turned on.

The TLB flush is no longer called when restoring from deep idle states. This was not done as a separate step because booting secondaries uses the same cpu_restore as idle restore, which needs the TLB flush.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-23powerpc/powernv: Fix kexec crashes caused by tlbie tracingMahesh Salgaonkar1-3/+12
Rebooting into a new kernel with kexec fails in trace_tlbie() which is called from native_hpte_clear(). This happens if the running kernel has CONFIG_LOCKDEP enabled. With lockdep enabled, the tracepoints always execute a few RCU checks regardless of whether tracing is on or off. We are already in the last phase of the kexec sequence in real mode with HILE_BE set. At this point the RCU check ends up in RCU_LOCKDEP_WARN and causes kexec to fail. Fix this by not calling trace_tlbie() from native_hpte_clear(). mpe: It's not safe to call trace points at this point in the kexec path, even if we could avoid the RCU checks/warnings. The only solution is to not call them. Fixes: 0428491cba92 ("powerpc/mm: Trace tlbie(l) instructions") Cc: stable@vger.kernel.org # v4.13+ Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-02powerpc/mm: Wire up hpte_removebolted for powernvAnton Blanchard1-0/+33
Adds support for removing bolted (i.e. kernel linear mapping) mappings on powernv. This is needed to support memory hot unplug operations which are required for the teardown of DAX/PMEM devices. Reviewed-by: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-02powerpc: use spin loop primitives in some functionsNicholas Piggin1-1/+4
Use the different spin loop primitives in some simple powerpc spin loops, including those which will spin as a common case. This will help to test the spin loop primitives before more conversions are done. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Add some includes of <linux/processor.h>] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-23powerpc/mm: Trace tlbie(l) instructionsBalbir Singh1-0/+3
Add a trace point for tlbie(l) (Translation Lookaside Buffer Invalidate Entry (Local)) instructions. The tlbie instruction has changed over the years, so not all versions accept the same operands. Use the ISA v3 field operands because they are the most verbose, we may change them in future. Example output: qemu-system-ppc-5371 [016] 1412.369519: tlbie: tlbie with lpid 0, local 1, rb=67bd8900174c11c1, rs=0, ric=0 prs=0 r=0 Signed-off-by: Balbir Singh <bsingharora@gmail.com> [mpe: Add some missing trace_tlbie()s, reword change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-05powerpc/mm: Add missing global TLB invalidate if cxl is activeFrederic Barrat1-2/+5
Commit 4c6d9acce1f4 ("powerpc/mm: Add hooks for cxl") converted local TLB invalidates to global if the cxl driver is active. This is necessary because the CAPP snoops invalidations to forward them to the PSL on the cxl adapter. However one path was forgotten. native_flush_hash_range() still does local TLB invalidates, as found out the hard way recently. This patch fixes it by following the same logic as previously: if the cxl driver is active, the local TLB invalidates are 'upgraded' to global. Fixes: 4c6d9acce1f4 ("powerpc/mm: Add hooks for cxl") Cc: stable@vger.kernel.org # v3.18+ Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-24Merge branch 'topic/ppc-kvm' into nextMichael Ellerman1-6/+24
Merge the topic branch we're sharing with the kvm-ppc tree.
2016-11-16powerpc/64: Simplify adaptation to new ISA v3.00 HPTE formatPaul Mackerras1-6/+24
This changes the way that we support the new ISA v3.00 HPTE format. Instead of adapting everything that uses HPTE values to handle either the old format or the new format, depending on which CPU we are on, we now convert explicitly between old and new formats if necessary in the low-level routines that actually access HPTEs in memory. This limits the amount of code that needs to know about the new format and makes the conversions explicit. This is OK because the old format contains all the information that is in the new format. This also fixes operation under a hypervisor, because the H_ENTER hypercall (and other hypercalls that deal with HPTEs) will continue to require the HPTE value to be supplied in the old format. At present the kernel will not boot in HPT mode on POWER9 under a hypervisor. This fixes and partially reverts commit 50de596de8be ("powerpc/mm/hash: Add support for Power9 Hash", 2016-04-29). Fixes: 50de596de8be ("powerpc/mm/hash: Add support for Power9 Hash") Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-14powerpc/hash64: Be more careful when generating tlbielBalbir Singh1-4/+6
In ISA v2.05, the tlbiel instruction takes two arguments, RB and L:

  tlbiel RB,L
  +---------+---------+----+---------+---------+---------+----+
  |   31    |    /    | L  |    /    |   RB    |   274   | /  |
  | 31 - 26 | 25 - 22 | 21 | 20 - 16 | 15 - 11 | 10 - 1  | 0  |
  +---------+---------+----+---------+---------+---------+----+

In ISA v2.06 tlbiel takes only one argument, RB:

  tlbiel RB
  +---------+---------+---------+---------+---------+----+
  |   31    |    /    |    /    |   RB    |   274   | /  |
  | 31 - 26 | 25 - 21 | 20 - 16 | 15 - 11 | 10 - 1  | 0  |
  +---------+---------+---------+---------+---------+----+

And in ISA v3.00 tlbiel takes five arguments:

  tlbiel RB,RS,RIC,PRS,R
  +---------+---------+----+---------+----+----+---------+---------+----+
  |   31    |   RS    | /  |   RIC   |PRS | R  |   RB    |   274   | /  |
  | 31 - 26 | 25 - 21 | 20 | 19 - 18 | 17 | 16 | 15 - 11 | 10 - 1  | 0  |
  +---------+---------+----+---------+----+----+---------+---------+----+

However the assembler also accepts "tlbiel RB", and generates "tlbiel RB,r0,0,0,0". As you can see above the L field from the v2.05 encoding overlaps with the reserved field of the v2.06 encoding, and the low bit of the RS field of the v3.00 encoding. Currently in __tlbiel() we generate two tlbiel instructions manually using hex constants. In the first case, for MMU_PAGE_4K, we generate "tlbiel RB,0", which is safe in all cases, because the L bit is zero. However in the default case we generate "tlbiel RB,1", therefore setting bit 21 to 1. This is not an actual bug on v2.06 processors, because the CPU ignores the value of the reserved field. However software is supposed to encode the reserved fields as zero to enable forward compatibility. On v3.00 processors setting bit 21 to 1 and no other bits of RS, means we are using r1 for the value of RS. Although it's not obvious, the code sets the IS field (bits 10-11) to 0 (by omission), and L=1, in the va value, which is passed as RB. We also pass R=0 in the instruction. The combination of IS=0, L=1 and R=0 means the value of RS is not used, so even on ISA v3.00 there is no actual bug. We should still fix it, as setting a reserved bit on v2.06 is naughty, and we are only avoiding a bug on v3.00 by accident rather than design. Use ASM_FTR_IFSET() to generate the single argument form on ISA v2.06 and later, and the two argument form on pre v2.06. Although there may be very old toolchains which don't understand tlbiel, we have other code in the tree which has been using tlbiel for over five years, and no one has reported any build failures, so just let the assembler generate the instructions. Signed-off-by: Balbir Singh <bsingharora@gmail.com> [mpe: Rewrite change log, use IFSET instead of IFCLR] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
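As an aside, a small illustrative sketch (not the kernel's __tlbiel() code) of how the instruction word follows from the field layouts above: primary opcode 31 in bits 31-26, RB in bits 15-11, extended opcode 274 in bits 10-1, with the contentious v2.05 L bit at bit 21.

  #include <stdint.h>
  #include <stdio.h>

  /* Illustrative encoder for the two-argument form "tlbiel RB,L", using the
   * field positions from the tables above.  Not the kernel implementation. */
  static uint32_t encode_tlbiel(unsigned int rb, unsigned int l)
  {
          return (31u << 26)            /* primary opcode              */
               | ((l & 1u) << 21)       /* L (reserved from v2.06 on)  */
               | ((rb & 0x1fu) << 11)   /* RB register number          */
               | (274u << 1);           /* extended opcode             */
  }

  int main(void)
  {
          /* "tlbiel r4,0" and "tlbiel r4,1" differ only in bit 21. */
          printf("%08x %08x\n", encode_tlbiel(4, 0), encode_tlbiel(4, 1));
          return 0;
  }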
2016-09-09powerpc/mm: Speed up computation of base and actual page size for a HPTEPaul Mackerras1-40/+2
This replaces a 2-D search through an array with a simple 8-bit table lookup for determining the actual and/or base page size for a HPT entry. The encoding in the second doubleword of the HPTE is designed to encode the actual and base page sizes without using any more bits than would be needed for a 4k page number, by using between 1 and 8 low-order bits of the RPN (real page number) field to encode the page sizes. A single "large page" bit in the first doubleword indicates that these low-order bits are to be interpreted like this. We can determine the page sizes by using the low-order 8 bits of the RPN to look up a 256-entry table. For actual page sizes less than 1MB, some of the upper bits of these 8 bits are going to be real address bits, but we can cope with that by replicating the entries for those smaller page sizes. While we're at it, let's move the hpte_page_size() and hpte_base_page_size() functions from a KVM-specific header to a header for 64-bit HPT systems, since this computation doesn't have anything specifically to do with KVM. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
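A minimal sketch of the lookup described above, under assumed names and an assumed RPN shift (this is not the kernel's hpte_page_size()): the low 8 bits of the RPN field index a 256-entry array of page-size shifts, with smaller page sizes replicated across the entries whose upper bits are really address bits.

  #include <stdint.h>

  /* lp_to_shift[] would be filled at init from the per-CPU LP encodings,
   * replicating entries for page sizes smaller than 1MB.  The table name
   * and the RPN shift used here are illustrative assumptions. */
  static uint8_t lp_to_shift[256];

  static uint64_t hpte_actual_size(uint64_t hpte_r, int large_page)
  {
          if (!large_page)
                  return 1ULL << 12;                          /* plain 4K entry  */
          return 1ULL << lp_to_shift[(hpte_r >> 12) & 0xff];  /* 8 low RPN bits  */
  }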
2016-08-04powerpc/mm: Move register_process_table() out of ppc_mdMichael Ellerman1-1/+1
We want to initialise register_process_table() before ppc_md is setup, so that it can be called as part of MMU init (at least on Radix ATM). That no longer works because probe_machine() requires that ppc_md be empty before it's called, and we now do probe_machine() much later. So make register_process_table a global for now. It will probably move into a mmu_radix_ops struct at some point in the future. This was broken by me when applying commit 7025776ed1eb "powerpc/mm: Move hash table ops to a separate structure" due to conflicts with other patches. Fixes: 7025776ed1eb ("powerpc/mm: Move hash table ops to a separate structure") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01powerpc/mm/hash: Add helper for finding SLBE LLP encodingAneesh Kumar K.V1-4/+2
Replace opencoding of the same at multiple places with the helper. No functional change with this patch. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21powerpc/mm: Move hash table ops to a separate structureBenjamin Herrenschmidt1-8/+8
Moving probe_machine() to after mmu init will cause the ppc_md fields relative to the hash table management to be overwritten. Since we have essentially disconnected the machine type from the hash backend ops, finish the job by moving them to a different structure. The only callback that didn't quite fit is update_partition_table since this is not specific to hash, so I moved it to a standalone variable for now. We can revisit later if needed. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Fix ppc64e build failure in kexec] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-17powerpc/mm/radix: Update machine call back to support new HCALL.Aneesh Kumar K.V1-2/+8
This updates the machine dep callback such that we can use the same callback to register the process table. The interface is updated such that we can easily call the H_REGISTER_PROC_TBL hcall. The HCALL itself is introduced in a later patch. No functionality change introduced by this patch. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-17powerpc/mm: Clear top 16 bits of va only on older cpusAneesh Kumar K.V1-2/+4
As per ISA, we need to do this only for architecture version 2.02 and earlier. This continued to work even for 2.07. But let's not do this for anything after 2.02. ISA 3.0 requires these top bits to be not cleared. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-15Merge tag 'powerpc-4.7-5' into nextMichael Ellerman1-4/+4
Pull in the fixes we sent during 4.7, we have code we want to merge into next that depends on some of them.
2016-06-14powerpc: Various typo fixesMichael Ellerman1-2/+2
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-14powerpc/mm/hash: Use the correct PPP mask when updating HPTEAneesh Kumar K.V1-4/+4
With commit e58e87adc8bf9 "powerpc/mm: Update _PAGE_KERNEL_RO" we now use all the three PPP bits. The top bit is now used to have a PPP value of 0b110 which will be mapped to kernel read only. When updating the hpte entry use the right mask such that we update the 63rd bit (top 'P' bit) too. Prior to e58e87adc8bf we didn't support KERNEL_RO at all (it was == KERNEL_RW), so this isn't a regression as such. Fixes: e58e87adc8bf ("powerpc/mm: Update _PAGE_KERNEL_RO") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-08powerpc/mm/hash: Compute the segment size correctly for ISA 3.0Aneesh Kumar K.V1-1/+5
PowerISA 3.0 encodes the segment size in the second half of hash page table entry. Update hpte_decode() accordingly. Fixes: 50de596de8be ("powerpc/mm/hash: Add support for Power9 Hash") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01powerpc/mm/hash: Add support for Power9 HashAneesh Kumar K.V1-1/+10
PowerISA 3.0 adds a partition table indexed by LPID. The partition table allows us to specify the MMU model that will be used for guest and host translation. This patch adds support with the SLB based hash model (UPRT = 0). What is required with this model is to support the new hash page table entry format and also setup the partition table such that we use the hash table for address translation. We don't have segment table support yet. In order to make sure we don't load the KVM module on Power9 (since we don't have kvm support yet) this patch also disables KVM on Power9. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-14powerpc/mm: Move THP headers aroundAneesh Kumar K.V1-0/+10
We support THP only with book3s_64 and 64K page size. Move THP details to hash64-64k.h to clarify the same. Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-09powerpc: Fix checkstop in native_hpte_clear() with lockdepCyril Bur1-12/+11
native_hpte_clear() is called in real mode from two places: - Early in boot during htab initialisation if firmware assisted dump is active. - Late in the kexec path. In both contexts there is no need to disable interrupts as they are already disabled. Furthermore, locking around the tlbie() is only required for pre POWER5 hardware. On POWER5 or newer hardware concurrent tlbie()s work as expected and on pre POWER5 hardware concurrent tlbie()s could result in deadlock. This code would only be executed at crashdump time, during which all bets are off, concurrent tlbie()s are unlikely and taking locks is unsafe therefore the best course of action is to simply do nothing. Concurrent tlbie()s are not possible in the first case as secondary CPUs have not come up yet. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03cxl: Move include file cxl.h -> cxl-base.hMichael Neuling1-1/+1
This moves the current include file from cxl.h -> cxl-base.h. This current include file is used only to pass information between the base driver that needs to be built into the kernel and the cxl module. This is to make way for a new include/misc/cxl.h which will contain just the kernel API for other drivers to use. Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-05powerpc/mm: don't do tlbie for updatepp request with NO HPTE faultAneesh Kumar K.V1-4/+11
updatepp can get called for a nohpte fault when we find from the linux page table that the translation was hashed before. In that case we are sure that there is no existing translation, hence we could avoid doing tlbie. We could possibly race with a parallel fault filling the TLB. But that should be ok because updatepp is only ever relaxing permissions. We also look at linux pte permission bits when filling hash pte permission bits. We also hold the linux pte busy bits while inserting/updating a hashpte entry, hence a parallel update of the linux pte is not possible. On the other hand mprotect involves ptep_modify_prot_start which causes a hpte invalidate and not updatepp.

Performance numbers: We use randbox_access_bench written by Anton.

Kernel with THP disabled and smaller hash page table size:
  86.60%  random_access_b  [kernel.kallsyms]    [k] .native_hpte_updatepp
   2.10%  random_access_b  random_access_bench  [.] doit
   1.99%  random_access_b  [kernel.kallsyms]    [k] .do_raw_spin_lock
   1.85%  random_access_b  [kernel.kallsyms]    [k] .native_hpte_insert
   1.26%  random_access_b  [kernel.kallsyms]    [k] .native_flush_hash_range
   1.18%  random_access_b  [kernel.kallsyms]    [k] .__delay
   0.69%  random_access_b  [kernel.kallsyms]    [k] .native_hpte_remove
   0.37%  random_access_b  [kernel.kallsyms]    [k] .clear_user_page
   0.34%  random_access_b  [kernel.kallsyms]    [k] .__hash_page_64K
   0.32%  random_access_b  [kernel.kallsyms]    [k] fast_exception_return
   0.30%  random_access_b  [kernel.kallsyms]    [k] .hash_page_mm

With Fix:
  27.54%  random_access_b  random_access_bench  [.] doit
  22.90%  random_access_b  [kernel.kallsyms]    [k] .native_hpte_insert
   5.76%  random_access_b  [kernel.kallsyms]    [k] .native_hpte_remove
   5.20%  random_access_b  [kernel.kallsyms]    [k] fast_exception_return
   5.12%  random_access_b  [kernel.kallsyms]    [k] .__hash_page_64K
   4.80%  random_access_b  [kernel.kallsyms]    [k] .hash_page_mm
   3.31%  random_access_b  [kernel.kallsyms]    [k] data_access_common
   1.84%  random_access_b  [kernel.kallsyms]    [k] .trace_hardirqs_on_caller

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-02powerpc/mm/thp: Use tlbiel if possibleAneesh Kumar K.V1-2/+2
If we know that user address space has never executed on other cpus we could use tlbiel. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-12-02powerpc/mm: Check for matching hpte without taking hpte lockAneesh Kumar K.V1-9/+15
With a smaller hash page table config, we would end up in a situation where we would be replacing hash page table slots frequently. In such a config, we will find the hpte to be not matching, and we can do that check without holding the hpte lock. We need to recheck the hpte again after holding the lock. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-11-03powerpc: Replace __get_cpu_var usesChristoph Lameter1-1/+1
This still has not been merged and now powerpc is the only arch that does not have this change. Sorry about missing linuxppc-dev before. V2->V2 - Fix up to work against 3.18-rc1

__get_cpu_var() is used for multiple purposes in the kernel source. One of them is address calculation via the form &__get_cpu_var(x). This calculates the address for the instance of the percpu variable of the current processor based on an offset. Other use cases are for storing and retrieving data from the current processors percpu area. __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment. __get_cpu_var() is defined as : __get_cpu_var() always only does an address determination. However, store and retrieve operations could use a segment prefix (or global register on other platforms) to avoid the address calculation. this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use optimized assembly code to read and write per cpu variables. This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided and fewer registers are used when code is generated. At the end of the patch set all uses of __get_cpu_var have been removed so the macro is removed too. The patch set includes passes over all arches as well. Once these operations are used throughout then specialized macros can be defined in non-x86 arches as well in order to optimize per cpu access by f.e. using a global register that may be set to the per cpu base.

Transformations done to __get_cpu_var()

1. Determine the address of the percpu instance of the current processor.
   DEFINE_PER_CPU(int, y);
   int *x = &__get_cpu_var(y);
   Converts to
   int *x = this_cpu_ptr(&y);

2. Same as #1 but this time an array structure is involved.
   DEFINE_PER_CPU(int, y[20]);
   int *x = __get_cpu_var(y);
   Converts to
   int *x = this_cpu_ptr(y);

3. Retrieve the content of the current processors instance of a per cpu variable.
   DEFINE_PER_CPU(int, y);
   int x = __get_cpu_var(y)
   Converts to
   int x = __this_cpu_read(y);

4. Retrieve the content of a percpu struct
   DEFINE_PER_CPU(struct mystruct, y);
   struct mystruct x = __get_cpu_var(y);
   Converts to
   memcpy(&x, this_cpu_ptr(&y), sizeof(x));

5. Assignment to a per cpu variable
   DEFINE_PER_CPU(int, y)
   __get_cpu_var(y) = x;
   Converts to
   __this_cpu_write(y, x);

6. Increment/Decrement etc of a per cpu variable
   DEFINE_PER_CPU(int, y);
   __get_cpu_var(y)++
   Converts to
   __this_cpu_inc(y)

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> Signed-off-by: Christoph Lameter <cl@linux.com> [mpe: Fix build errors caused by set/or_softirq_pending(), and rework assignment in __set_breakpoint() to use memcpy().] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08powerpc/mm: Add hooks for cxlIan Munsie1-1/+5
This adds hooks into the core powerpc mm code for cxl. The core powerpc code sometimes uses local tlbie. Unfortunately this won't work with the current cxl driver as it relies on snooping tlbie broadcasts. The cxl hardware can have TLB entries invalidated via MMIO but this is not currently supported by the driver. In future we can make local tlbie smarter so that it invalidates cxl contexts via MMIO when it needs to but for now we have this workaround. This workaround checks for any active cxl contexts and if so, disables local tlbie. This also adds a hook for when SLBs are invalidated. This ensures any corresponding SLBs in cxl are also invalidated at the same time. This is required for segment demotion. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-08-13powerpc/thp: Invalidate with vpn in loopAneesh Kumar K.V1-16/+7
As per ISA, for 4k base page size we compare 14..65 bits of VA specified with the entry_VA in tlb. That implies we need to make sure we do a tlbie with all the possible 4k va we used to access the 16MB hugepage. With 64k base page size we compare 14..57 bits of VA. Hence we cannot ignore the lower 24 bits of va while doing tlbie. We also cannot tlb invalidate a 16MB entry with just one tlbie instruction because we don't track which va was used to instantiate the tlb entry. CC: <stable@vger.kernel.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13powerpc/thp: Don't recompute vsid and ssize in loop on invalidateAneesh Kumar K.V1-14/+5
The segment identifier and segment size will remain the same in the loop, so we can compute them outside. We also change the hugepage_invalidate interface so that we can use it in the later patch. CC: <stable@vger.kernel.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28powerpc/mm: Fix tlbie to add AVAL fields for 64K pagesAneesh Kumar K.V1-22/+16
The if condition check was based on a draft ISA doc. Remove the same. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-10-11powerpc: Book 3S MMU little endian supportAnton Blanchard1-20/+26
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/mm: Use the correct SLB(LLP) encoding in tlbie instructionAneesh Kumar K.V1-2/+8
The sllp value is stored in mmu_psize_defs in such a way that we can easily OR the value to get the operand for the slbmte instruction, i.e. the L and LP bits are not contiguous. Decode the bits and use them correctly in tlbie. The regression was introduced by 1f6aaaccb1b3af8613fe45781c1aefee2ae8c6b3 "powerpc: Update tlbie/tlbiel as per ISA doc" Reported-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24powerpc/mm: Fix fallthrough bug in hpte_decodeAneesh Kumar K.V1-0/+2
We should not fallthrough different case statements in hpte_decode. Add break statement to break out of the switch. The regression is introduced by dcda287a9b26309ae43a091d0ecde16f8f61b4c0 "powerpc/mm: Simplify hpte_decode" Reported-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc: Optimize hugepage invalidateAneesh Kumar K.V1-0/+73
Hugepage invalidate involves invalidating multiple hpte entries. Optimize the operation using H_BULK_REMOVE on lpar platforms. On native, reduce the number of tlb flush. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc/mm: handle hugepage size correctly when invalidating hpte entriesAneesh Kumar K.V1-77/+45
If a hash bucket gets full, we "evict" a more/less random entry from it. When we do that we don't invalidate the TLB (hpte_remove) because we assume the old translation is still technically "valid". This implies that when we are invalidating or updating a pte, even if the HPTE entry is not valid we should do a tlb invalidate. With hugepages, we need to pass the correct actual page size value for tlb invalidation. This change updates the patch 0608d692463598c1d6e826d9dd7283381b4f246c "powerpc/mm: Always invalidate tlb on hpte invalidate and update" to handle transparent hugepages correctly. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-01powerpc/mm: Always invalidate tlb on hpte invalidate and updateAneesh Kumar K.V1-8/+22
If a hash bucket gets full, we "evict" a more/less random entry from it. When we do that we don't invalidate the TLB (hpte_remove) because we assume the old translation is still technically "valid". This implies that when we are invalidating or updating pte, even if HPTE entry is not valid we should do a tlb invalidate. This was a regression introduced by b1022fbd293564de91596b8775340cf41ad5214c Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-30powerpc: Update tlbie/tlbiel as per ISA docAneesh Kumar K.V1-2/+30
Encode the actual page correctly in tlbie/tlbiel. This makes sure we handle multiple page size segments correctly. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-30powerpc: Fix hpte_decode to use the correct decoding for page sizesAneesh Kumar K.V1-31/+22
As per the ISA doc, we encode base and actual page size in the LP bits of the PTE. The number of bits used to encode the page sizes depends on the actual page size. The ISA doc lists this as:

  PTE LP     actual page size
  rrrr rrrz  >=8KB
  rrrr rrzz  >=16KB
  rrrr rzzz  >=32KB
  rrrr zzzz  >=64KB
  rrrz zzzz  >=128KB
  rrzz zzzz  >=256KB
  rzzz zzzz  >=512KB
  zzzz zzzz  >=1MB

The ISA doc also says "The values of the “z” bits used to specify each size, along with all possible values of “r” bits in the LP field, must result in LP values distinct from other LP values for other sizes." Based on the above, update hpte_decode to use the correct decoding for LP bits. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
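A minimal sketch of the decode rule implied by the table above (the patterns below are made-up placeholders, not real LP encodings, and this is not the kernel's hpte_decode()): each actual page size owns a fixed pattern in its low-order "z" bits, so decoding is a masked compare against per-size entries, checked from the widest pattern down.

  #include <stdint.h>

  /* Illustrative LP decode following the r/z table above.  Entries are
   * ordered most-specific (most z bits) first so the masked compare is
   * unambiguous even with these placeholder patterns. */
  struct lp_enc {
          uint8_t zbits;        /* number of low-order bits carrying the size */
          uint8_t pattern;      /* hypothetical z-bit pattern for this size   */
          unsigned int shift;   /* log2 of the actual page size               */
  };

  static const struct lp_enc encodings[] = {
          { 8, 0x38, 20 },      /* >=1MB:  zzzz zzzz (placeholder pattern) */
          { 4, 0x01, 16 },      /* >=64KB: rrrr zzzz (placeholder pattern) */
          { 1, 0x00, 13 },      /* >=8KB:  rrrr rrrz (placeholder pattern) */
  };

  static int lp_to_shift(uint8_t lp)
  {
          for (unsigned int i = 0; i < sizeof(encodings) / sizeof(encodings[0]); i++) {
                  uint8_t mask = (uint8_t)((1u << encodings[i].zbits) - 1);
                  if ((lp & mask) == encodings[i].pattern)
                          return (int)encodings[i].shift;
          }
          return -1;            /* no matching encoding */
  }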
2013-04-30powerpc: Decode the pte-lp-encoding bits correctly.Aneesh Kumar K.V1-37/+98
We look at both the segment base page size and actual page size and store the pte-lp-encodings in an array per base page size. We also update all relevant functions to take actual page size argument so that we can use the correct PTE LP encoding in HPTE. This should also get the basic Multiple Page Size per Segment (MPSS) support. This is needed to enable THP on ppc64. [Fixed PR KVM build --BenH] Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-30powerpc: Use encode avpn where we need only avpn valuesAneesh Kumar K.V1-4/+4
In all these cases we are doing something similar to HPTE_V_COMPARE(hpte_v, want_v) which ignores the HPTE_V_LARGE bit With MPSS support we would need actual page size to set HPTE_V_LARGE bit and that won't be available in most of these cases. Since we are ignoring HPTE_V_LARGE bit, use the avpn value instead. There should not be any change in behaviour after this patch. Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-27powerpc: Remove tlb batching hack for nighthawkMichael Ellerman1-25/+1
In hpte_init_native() we call tlb_batching_enabled() to decide if we should setup ppc_md.flush_hash_range. tlb_batching_enabled() checks the _unflattened_ device tree, to see if we are running on a nighthawk. Since commit a223535 ("dont allow pSeries_probe to succeed without initialising MMU", Dec 2006), hpte_init_native() has been called from pSeries_probe() - at which point we have not yet unflattened the device tree. This means tlb_batching_enabled() will always return true, so the hack has effectively been disabled since Dec 2006. Ergo, I think we can drop it. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc/mm: Convert virtual address to vpnAneesh Kumar K.V1-49/+72
This patch converts different functions to take a virtual page number instead of a virtual address. The virtual page number is the virtual address shifted right by VPN_SHIFT (12) bits. This enables us to have an address range of up to 76 bits. Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
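For a quick feel of the arithmetic (the helper name is hypothetical; only the 12-bit shift comes from the message above): keeping the vpn in a 64-bit word plus the 12 shifted-out offset bits is what gives the quoted 64 + 12 = 76-bit virtual address range.

  #include <stdint.h>

  #define VPN_SHIFT 12   /* shift quoted in the commit message above */

  /* Hypothetical helper, not the kernel's: a 64-bit vpn together with the
   * 12-bit in-page offset covers a 76-bit virtual address range. */
  static inline uint64_t va_to_vpn(uint64_t va)
  {
          return va >> VPN_SHIFT;
  }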
2012-09-17powerpc/mm: Simplify hpte_decodeAneesh Kumar K.V1-21/+28
This patch simplifies hpte_decode for easy switching from virtual address to virtual page number in the later patch. Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05powerpc: Remove all includes of <asm/abs_addr.h>Michael Ellerman1-1/+1
It's empty now, apart from other includes. Fixup a few files that were getting things via this header. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-07-12KVM: PPC: book3s_hv: Add support for PPC970-family processorsPaul Mackerras1-1/+1
This adds support for running KVM guests in supervisor mode on those PPC970 processors that have a usable hypervisor mode. Unfortunately, Apple G5 machines have supervisor mode disabled (MSR[HV] is forced to 1), but the YDL PowerStation does have a usable hypervisor mode. There are several differences between the PPC970 and POWER7 in how guests are managed. These differences are accommodated using the CPU_FTR_ARCH_201 (PPC970) and CPU_FTR_ARCH_206 (POWER7) CPU feature bits. Notably, on PPC970: * The LPCR, LPID or RMOR registers don't exist, and the functions of those registers are provided by bits in HID4 and one bit in HID0. * External interrupts can be directed to the hypervisor, but unlike POWER7 they are masked by MSR[EE] in non-hypervisor modes and use SRR0/1 not HSRR0/1. * There is no virtual RMA (VRMA) mode; the guest must use an RMO (real mode offset) area. * The TLB entries are not tagged with the LPID, so it is necessary to flush the whole TLB on partition switch. Furthermore, when switching partitions we have to ensure that no other CPU is executing the tlbie or tlbsync instructions in either the old or the new partition, otherwise undefined behaviour can occur. * The PMU has 8 counters (PMC registers) rather than 6. * The DSCR, PURR, SPURR, AMR, AMOR, UAMOR registers don't exist. * The SLB has 64 entries rather than 32. * There is no mediated external interrupt facility, so if we switch to a guest that has a virtual external interrupt pending but the guest has MSR[EE] = 0, we have to arrange to have an interrupt pending for it so that we can get control back once it re-enables interrupts. We do that by sending ourselves an IPI with smp_send_reschedule after hard-disabling interrupts. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12powerpc, KVM: Split HVMODE_206 cpu feature bit into separate HV and architecture bitsPaul Mackerras1-2/+2
This replaces the single CPU_FTR_HVMODE_206 bit with two bits, one to indicate that we have a usable hypervisor mode, and another to indicate that the processor conforms to PowerISA version 2.06. We also add another bit to indicate that the processor conforms to ISA version 2.01 and set that for PPC970 and derivatives. Some PPC970 chips (specifically those in Apple machines) have a hypervisor mode in that MSR[HV] is always 1, but the hypervisor mode is not useful in the sense that there is no way to run any code in supervisor mode (HV=0 PR=0). On these processors, the LPES0 and LPES1 bits in HID4 are always 0, and we use that as a way of detecting that hypervisor mode is not useful. Where we have a feature section in assembly code around code that only applies on POWER7 in hypervisor mode, we use a construct like END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206). The definition of END_FTR_SECTION_IFSET is such that the code will be enabled (not overwritten with nops) only if all bits in the provided mask are set. Note that the CPU feature check in __tlbie() only needs to check the ARCH_206 bit, not the HVMODE bit, because __tlbie() can only get called if we are running bare-metal, i.e. in hypervisor mode. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2011-05-04powerpc: Use new CPU feature bit to select 2.06 tlbieMichael Neuling1-6/+4
This removes MMU_FTR_TLBIE_206 as we can now use CPU_FTR_HVMODE_206. It also changes the logic to select which tlbie to use to be based on this new CPU feature bit. This also duplicates the ASM_FTR_IF/SET/CLR defines for CPU features (copied from MMU features). Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27powerpc: Free up some CPU feature bits by moving out MMU-related featuresMatt Evans1-4/+4
Some of the 64bit PPC CPU features are MMU-related, so this patch moves them to MMU_FTR_ bits. All cpu_has_feature()-style tests are moved to mmu_has_feature(), and seven feature bits are freed as a result. Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-19powerpc: Convert native_tlbie_lock to raw_spinlockThomas Gleixner1-7/+7
native_tlbie_lock needs to be a real spinlock in RT. Convert it to raw_spinlock. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-17powerpc: Convert open coded native hashtable bit lockAnton Blanchard1-3/+2
Now we have real bit locks use them instead of open coding it. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21powerpc: Add 2.06 tlbie mnemonicsMilton Miller1-2/+11
This adds the PowerPC 2.06 tlbie mnemonics and keeps backwards compatibility for CPUs before 2.06. Only useful for bare metal systems. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2007-10-12[POWERPC] Use 1TB segmentsPaul Mackerras1-46/+46
This makes the kernel use 1TB segments for all kernel mappings and for user addresses of 1TB and above, on machines which support them (currently POWER5+, POWER6 and PA6T). We detect that the machine supports 1TB segments by looking at the ibm,processor-segment-sizes property in the device tree. We don't currently use 1TB segments for user addresses < 1T, since that would effectively prevent 32-bit processes from using huge pages unless we also had a way to revert to using 256MB segments. That would be possible but would involve extra complications (such as keeping track of which segment size was used when HPTEs were inserted) and is not addressed here. Parts of this patch were originally written by Ben Herrenschmidt. Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-10[POWERPC] Move inline asm eieio to using eieio inline functionKumar Gala1-1/+1
Use the eieio function so we can redefine what eieio does rather than direct inline asm. This is partly code cleanup and partly because not all PPCs have eieio (book-e has mbar that maps to eieio). Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2007-06-14[POWERPC] Kill typedef-ed structs for hash PTEs and BATsDavid Gibson1-11/+11
Using typedefs to rename structure types is frowned on by CodingStyle. However, we do so for the hash PTE structure on both ppc32 (where it's called "PTE") and ppc64 (where it's called "hpte_t"). On ppc32 we also have such a typedef for the BATs ("BAT"). This removes this unhelpful use of typedefs, in the process bringing ppc32 and ppc64 closer together, by using the name "struct hash_pte" in both cases. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-06-14[POWERPC] Move common code out of if/elseJon Tollefson1-2/+1
Move common code out of if/else. Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com> ---- hash_native_64.c | 3 +-- 1 files changed, 1 insertion(+), 2 deletions(-) Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-12[POWERPC] Remove unused variable in hpte_decode()Stephen Rothwell1-1/+1
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-12[POWERPC] Assign correct variable in hpte_decode()Stephen Rothwell1-1/+1
This case will never be hit, but it should be corrected anyway. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-10[POWERPC] Fix warning in hpte_decode(), and generalize itPaul Mackerras1-20/+17
This adds the necessary support to hpte_decode() to handle 1TB segments and 16GB pages, and removes an uninitialized value warning on avpn. We don't have any code to generate HPTEs for 1TB segments or 16GB pages yet, so this is mostly for completeness, and to fix the warning. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2007-05-07[POWERPC] 64K page support for kexecLuke Browning1-22/+62
This fixes a couple of kexec problems related to 64K page support in the kernel. kexec issues a tlbie for each pte. The parameters for the tlbie are the page size and the virtual address. Support was missing for the computation of these two parameters for 64K pages. This adds that support. Signed-off-by: Luke Browning <lukebrowning@us.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-04-13[POWERPC] Rename get_property to of_get_property: arch/powerpcStephen Rothwell1-1/+1
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-10-16[POWERPC] Make pSeries_lpar_hpte_insert staticGeoff Levand1-1/+1
Change the powerpc hpte_insert routines now called through ppc_md to static scope. Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-28[POWERPC] powerpc: Initialise ppc_md htab pointers earlierMichael Ellerman1-2/+1
Initialise the ppc_md htab callbacks earlier, in the probe routines. This allows us to call htab_finish_init() from htab_initialize(), and makes it private to hash_utils_64.c. Move htab_finish_init() and make_bl() above htab_initialize() to avoid forward declarations. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpcLinus Torvalds1-1/+1
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (139 commits)
  [POWERPC] re-enable OProfile for iSeries, using timer interrupt
  [POWERPC] support ibm,extended-*-frequency properties
  [POWERPC] Extra sanity check in EEH code
  [POWERPC] Dont look for class-code in pci children
  [POWERPC] Fix mdelay badness on shared processor partitions
  [POWERPC] disable floating point exceptions for init
  [POWERPC] Unify ppc syscall tables
  [POWERPC] mpic: add support for serial mode interrupts
  [POWERPC] pseries: Print PCI slot location code on failure
  [POWERPC] spufs: one more fix for 64k pages
  [POWERPC] spufs: fail spu_create with invalid flags
  [POWERPC] spufs: clear class2 interrupt status before wakeup
  [POWERPC] spufs: fix Makefile for "make clean"
  [POWERPC] spufs: remove stop_code from struct spu
  [POWERPC] spufs: fix spu irq affinity setting
  [POWERPC] spufs: further abstract priv1 register access
  [POWERPC] spufs: split the Cell BE support into generic and platform dependant parts
  [POWERPC] spufs: dont try to access SPE channel 1 count
  [POWERPC] spufs: use kzalloc in create_spu
  [POWERPC] spufs: fix initial state of wbox file
  ...

Manually resolved conflicts in:
  drivers/net/phy/Makefile
  include/asm-powerpc/spu.h
2006-06-17[PATCH] powerpc: Fix 64k pages on non-partitioned machinesArnd Bergmann1-2/+2
The page size encoding passed to tlbie is incorrect for new-style large pages. This fixes it. This doesn't affect anything on older machines because mmu_psize_defs[psize].penc (the page size encoding) is 0 for 4k and 16M pages (the two are distinguished by a separate "is a large page" bit). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-09[PATCH] powerpc: Fix buglet with MMU hash managementBenjamin Herrenschmidt1-1/+1
Our MMU hash management code would not set the "C" bit (changed bit) in the hardware PTE when updating a RO PTE into a RW PTE. That would cause the hardware to possibly do a write back to the hash table to set it on the first store access, which in addition to being a performance issue, might also hit a bug when running with native hash management (non-HV) as our code is specifically optimized for the case where no write back happens. Thus there is a very small theoretical window where a hash PTE can become corrupted if that HPTE has just been upgraded to read write, a store access happens on it, and that races with another processor evicting that same slot. Since eviction (caused by an almost full hash) is extremely rare, the bug is very unlikely to happen fortunately. This fixes it by allowing the updating of the protection bits in the native hash handling to also set (but not clear) the "C" bit, and, in order to also improve performance in the general case, by always setting that bit on newly inserted hash PTEs so that writeback really never happens. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-24[PATCH] powerpc64: fix spinlock recursion in native_hpte_clearR Sharada1-1/+6
native_hpte_clear has a spinlock recursion problem with the native_tlbie_lock being called twice, once in native_hpte_clear() and once within tlbie(). Fix the problem by changing the call to tlbie() in native_hpte_clear() to __tlbie(). It still supports only 4k pages for now. Signed-off-by: R Sharada <sharada@in.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-06[PATCH] ppc64: support 64k pagesBenjamin Herrenschmidt1-147/+230
Adds a new CONFIG_PPC_64K_PAGES which, when enabled, changes the kernel base page size to 64K. The resulting kernel still boots on any hardware. On current machines with 4K pages support only, the kernel will maintain 16 "subpages" for each 64K page transparently. Note that while real 64K capable HW has been tested, the current patch will not enable it yet as such hardware is not released yet, and I'm still verifying with the firmware architects the proper way to get the information from the newer hypervisors. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-10powerpc: Merge arch/ppc64/mm to arch/powerpc/mmPaul Mackerras1-0/+446
This moves the remaining files in arch/ppc64/mm to arch/powerpc/mm, and arranges that we use them when compiling with ARCH=ppc64. Signed-off-by: Paul Mackerras <paulus@samba.org>