Age | Commit message (Collapse) | Author | Files | Lines |
|
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Add flags support to the assembler version of push_section_type().
This also makes SECTION_TYPE() the assembler default if
the flags are empty. This is optional, but since we are enabling
SECTION_TYPE() to be defined for all code this lets us keep a
definition for it for C code, assembly, and assembler.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This lets us match the style of using push_section_type for
asm volatile but for use in actual foo.S code. We keep the
SECTION_TYPE() declaration as well since both tables.h and ranges.h
use it for all code (C code, assembly, and assembler). We also keep it
because we'll next extend the assembler version of
push_section_type() with a flags option; SECTION_TYPE() on the
assembler version can then be used when the flags are empty.
While at it avoid requiring quotes for the level, this lets us
match all uses of the levels for all code.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This shows how we can add an architecture enabled optimization for
ps_shr(). This is currently disabled though as I'm sure the asm is
not right yet. Once the x86 asm is fixed in the header file
arch/x86/include/asm/ps_const.h just edit the top level Makefile and
uncomment:
#CFLAGS += -DCONFIG_HAVE_ARCH_PS_CONST
Expected synth driver output:
Synthetics: ps_shr(0xDEADBEEF, get_demo_shr) returns: 0x0000DEAD
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This is used on C code calling in to asm code via asm volatile().
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
With generic section type solutions in place we need a general
asm solution for declaring entries with asm. The first easy target
is to cover the C asm declarations, guard the header file for now
with __ASSEMBLY__ and define a first generic entry push_section_type()
to be used later for custom section type annotations.
Architectures can override. As suggested by hpa and later confirmed by
Heiko for the extreme corner case concern on s390, just \n should work
across the board [0].
[0] https://lkml.kernel.org/r/20160226145603.GA3964@osiris
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This enables asm code to also use SECTION_TBL(). If we want, we can
later add an asm equivalent to DECLARE_SECTION_RANGE() for tables,
but we leave this for later as we don't need it yet.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This demos the use of "partially static" variables. They are like
consts, however we modify them early at boot time and provide
functions that do the handiwork for us. The handiwork of
processing them is accomplished through the kernel's alternatives.
To this end we add a simple example alternatives processing
mechanism. Its architecture is rather simple given it simply relies
on the linker table with structural data, and updates it as needed.
We demo a partially static function which shifts a variable to
the right by a certain number of bits. The number of bits we
shift by is determined through a function passed early in boot,
and as such can be dynamic.
Without alternatives processed you'd get:
Synthetics: ps_shr(0XDEADBEEF) returns: 0xDEADBEEF
After alternatives is processed you get:
Synthetics: ps_shr(0XDEADBEEF) returns: 0x0000DEAD
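The before/after behaviour above can be sketched in plain C. This is only
an illustration under stated assumptions, not the code from this commit:
struct ps_alt, ps_alt_table and apply_alternatives() are hypothetical
names, a plain array stands in for the linker table, and the shift of 16
is inferred from the expected output.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry: which variable to patch and who computes its value. */
struct ps_alt {
	unsigned int *target;       /* the "partially static" variable */
	unsigned int (*get)(void);  /* called once while patching */
};

/* Starts at 0, so ps_shr() is a no-op until alternatives are applied. */
unsigned int ps_shr_amount;

/* Assumed value: 16 turns 0xDEADBEEF into 0x0000DEAD. */
unsigned int get_demo_shr(void)
{
	return 16;
}

/* Plain array standing in for the linker table of the real demo. */
static const struct ps_alt ps_alt_table[] = {
	{ &ps_shr_amount, get_demo_shr },
};

/* Walk the table once, early, and fix up every registered variable. */
void apply_alternatives(void)
{
	size_t i;

	for (i = 0; i < sizeof(ps_alt_table) / sizeof(ps_alt_table[0]); i++)
		*ps_alt_table[i].target = ps_alt_table[i].get();
}

/* After patching, the shift amount is treated as if it were constant. */
uint32_t ps_shr(uint32_t val)
{
	return val >> ps_shr_amount;
}
```

Calling ps_shr(0xDEADBEEF) before apply_alternatives() hands the value
back unchanged; calling it afterwards yields 0x0000DEAD.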
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This adds an asm version of DECLARE_SECTION_RANGE().
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This demos building a synthetic "or" function using section
ranges in asm. The file or.S consists of completely unordered pieces of
code, each doing different things and each also pegged onto
the same section, however specified with a specific order level.
The different pieces of unordered code illustrate how we'd expect code to
be gathered and compiled; each piece of code could realistically be in
separate files. We add some #ifdefs with CONFIG_* options, we could also
expect that some of the code might be present and some of it might not
while it all still works together.
We'll later add support for building these pieces of code separately.
The linker stitches these pieces of code for us using the levels for
order, in the end it builds a single concise function which then runs
in order.
Upon running this we get a series of variables OR'd together:
Synthetics: synth_init_or(2) returns: 0xDEADBEEF
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This simplifies the section names. Requested by hpa.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This does not need to use __typeof__(): since we have
the name, just refer to it directly. There are quite
a few users of this style in the kernel already as well,
for example: net/netfilter/xt_repldata.h
It doesn't seem worth generalizing this into
include/linux/compiler-gcc.h as we're providing all our
attributes in one shot.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Upon review with hpa the second pass issues with ld should be
gone if the section ranges and linker tables are used as the custom
linker script is not going to be relied upon for the alignment. The
2nd pass issue was due to the fact that the compiler and linker might
have different strategies for figuring what the proper alignment
should be, and there can be a disconnect between what the compiler
assumes and what the linker ends up also deciding to use.
By using section ranges and linker tables that disconnect is no longer
there, as we are relying on the compiler exclusively now for alignment.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This is being discussed:
https://lkml.kernel.org/r/CAB=NE6W6Eh1tU=NXf20f-CPYeT22mtXGzrYqM9W-f14dGBrEsA@mail.gmail.com
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
As per hpa, an order level can be highly desirable for section
ranges, for instance to build a synthetic function. We'll try to
add an example of a synthetic function later.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
That was the original intention.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
To support an asm version of SECTION_RANGE(), add a respective asm
version of SECTION_TYPE().
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Upon review with hpa it's better to have the shorter version for
what we expect linker tables to be used for mostly, and that is data.
DECLARE_LINKTABLE_TEXT() is then required for .text.
Also expand the documentation to note that we then have only two
declarations:
o DECLARE_LINKTABLE() for data
o DECLARE_LINKTABLE_RO() for read-only data
The definitions are used to associate these with actual sections
on the code that implements support for the table.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
A pending patch proposes treating copyleft-next as GPLv2 compatible;
it adds the license to the list of possible licenses
used on modules and files [0] in the Linux kernel. While
integration of this patch is pending, relicense this userspace
code to copyleft-next in the meantime. copyleft-next is my
license of choice.
The current template license on tables.h is temporary while
it's sorted out what is best.
As noted in the proposed patch upstream to add copyleft-next to
the list of kernel compatible licenses [0] but added here to explain
*why* I've decided to do this, a summary of benefits of using
copyleft-next >= 0.3.1 over GPLv2 is listed below, it shows *why*
some folks like myself will prefer it over GPLv2 for future work.
An obvious gain not in that list is that this particular userspace
code is now under copyleft-next, so while GPLv2 applies to the Linux
kernel we benefit from keeping this code under copyleft-next if
future enhancements are made upstream on the Linux kernel.
o It is much shorter and simpler
o It has an explicit patent license grant, unlike GPLv2
o Its notice preservation conditions are clearer
o More free software/open source licenses are compatible
with it (via section 4)
o The source code requirement triggered by binary distribution
is much simpler in a procedural sense
o Recipients potentially have a contract claim against distributors
who are noncompliant with the source code requirement
o There is a built-in inbound=outbound policy for upstream
contributions (cf. Apache License 2.0 section 5)
o There are disincentives to engage in the controversial practice
of copyleft/proprietary dual-licensing
o In 15 years copyleft expires, which can be advantageous
for legacy code
o There are explicit disincentives to bringing patent infringement
claims accusing the licensed work of infringement (see 10b)
o There is a cure period for licensees who are not compliant
with the license (there is no cure opportunity in GPLv2)
o copyleft-next has a 'built-in or-later' provision
[0] http://lkml.kernel.org/r/1465929311-13509-1-git-send-email-mcgrof@kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
At this point we've diverged so much this is clearly
and obviously a full re-write.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Quite a few generalizations between section tables
and ranges:
o Replace the *SECTION_TBL*() macros with generic
*SECTION_TYPE*() macros. This lets us be more agnostic
between the linker tables and section ranges and also
lets us add other possible section types in the future
in a more generic fashion.
o Generalize the arbitrary order of "any" as SECTION_ORDER_ANY,
any special section type not needing order can use this. The
section ranges are an example.
o Keep sort for section ranges on vmlinux.lds.S, otherwise
the start and end declaration will not work. By keeping the
sort and using "any" for in between items we're able to extend
section ranges without modifying the linker script.
o Rename the special section for tables from .tbl to .tables
o Use the special section for ranges as .ranges.
o Require SECTION_TYPE_* name declarations for each new type
of special section type. We start off with two:
- SECTION_TYPE_RANGES this is for .ranges
- SECTION_TYPE_TABLES this is for .tables
These are postfixes for each special section type. By
using these we can easily modify the special section
postfix in header files in one place.
o Due to the very explicit divide between section ranges and
section tables we now need a separate entry for each in vmlinux.lds.S
but we accept this for now as it helps with semantics in separation.
Technically we could also just shove both section ranges and
section tables under its own special section prefix, perhaps
.types and then we can move both section ranges and section tables
underneath, so we'd have, for example for .text:
.text
.text.types.*
.text.types.tables
.text.types.ranges
While this saves us the need to add a new special section type
under vmlinux.lds.S we may lose the ability to sort tables.* and/or
we are forcing a sort for *all* special section types, thereby slowing
down link time.
o Since we modified tons of section stuff, we needed to fix the
'make clean' as otherwise we are left with old objects using
the old .tbl section names.
Tons of bike shedding opportunities here! We let the fun bike shedding
discussions happen openly with the community.
Run time tested:
Initializing x86 bare metal world
x86-init: Number of init entries: 7
Initializing kasan ...
Early init for Kasan...
Completed initializing kasan !
Initializing memory ...
Completed initializing memory !
Initializing kprobes ...
== OK: test_kprobe_0001 within range!
== OK: test_kprobe_0002 not in range as expected!
Completed initializing kprobes !
Initializing pci ...
PCI fixup size: 1
Demo: Using LINKTABLE_FOR_EACH
foo_fixup
Demo: Using LINKTABLE_RUN_ALL
foo_fixup
Completed initializing pci !
Initializing beta ...
Completed initializing beta !
Initializing alpha ...
Completed initializing alpha !
Booting bare metal
Calling start_kernel()...
ACME: Initializing ...
ACME: Finished init ... !
ACME: Running scheduled work
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This ports the sub-section .sched.text to use
the more standard section range interface, doing so
avoids mucking with the linker script and pushes us
to explicitly annotate the standard section where this
belongs, all in C code and header files.
The section range used, in this case "sched_text", is
arbitrary but it needs to be unique within the defined
section used; in this case SECTION_TEXT cannot have any
other "sched_text" section ranges.
We now have:
$ readelf -S kernel/locking/mutex.o | grep sched
[ 6] .text.tbl.sched_t PROGBITS 0000000000000000 000000b0
[ 9] .text.tbl.sched_t PROGBITS 0000000000000000 00000138
[10] .text.tbl.sched_t PROGBITS 0000000000000000 00000138
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Also use this in the linker script. This is the first
part of porting .sched.text to standardized section ranges.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
We move mutex calls to its own sub-section, underneath .text.
We'll next try to use the standard section work we've been doing
to see how this will fit as sub-sections within the framework.
readelf -S kernel/locking/mutex.o | grep sched
[ 6] .sched.text PROGBITS 0000000000000000 000000b0
[ 7] .rela.sched.text RELA 0000000000000000 00001368
When we run ./main:
Initializing x86 bare metal world
x86-init: Number of init entries: 7
Initializing kasan ...
Early init for Kasan...
Completed initializing kasan !
Initializing memory ...
Completed initializing memory !
Initializing kprobes ...
== OK: test_kprobe_0001 within range!
== OK: test_kprobe_0002 not in range as expected!
Completed initializing kprobes !
Initializing pci ...
PCI fixup size: 1
Demo: Using LINKTABLE_FOR_EACH
foo_fixup
Demo: Using LINKTABLE_RUN_ALL
foo_fixup
Completed initializing pci !
Initializing beta ...
Completed initializing beta !
Initializing alpha ...
Completed initializing alpha !
Booting bare metal
Calling start_kernel()...
ACME: Initializing ...
ACME: Finished init ... !
ACME: Running scheduled work
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Workqueues use mutex_lock() behind the scenes, which we'll
soon port to use its own section matching the Linux
kernel style.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This was using the x86 init tables, that's for x86.
Move this stuff to the proper Linux kernel built-in
driver infrastructure.
Note that this now loads during start_kernel().
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
There are no drivers yet, that's next.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This updates the x86 init tables to account for some initial
feedback after the v2 series. In particular, some highlights:
o removed run time sorting: this is not needed yet
o since we now have only 2 declarers:
DECLARE_LINKTABLE() -- const
DECLARE_LINKTABLE_DATA() - non const
We use DECLARE_LINKTABLE_DATA() for the table, and the definition
actually pegs it to the right section.
o we drop x86_init_fn_setup_arch() and x86_init_fn_late_init()
as this is better evolved with time with upstream. The point
is made and I think it's understood what the goal is here.
o we drop the char *name from the struct x86_init_fn given to
match the upstream kernel use. The kernel can get the name
using %pF, but we can't since libc stdio printf does not support
this. Maybe we can port this somehow in... but I can't see how.
For now just comment out %pF uses and deal with it by cluttering
our init calls so we know what triggered where.
For instance the kernel can use but we'll comment this sort of
stuff out:
pr_err("Init sequence fails to declares any supported subarchs: %pF\n", fn->early_init);
o The X86_INIT_EARLY_*() macros are simplified to only provide what we
need, we also use lower case, so x86_init_early_pc(), etc.
o The xen macro x86_init_early_xen() is added but won't be part of the
v3 upstream submission. That will later be used but we have
to re-consider a different level order for it, perhaps a hypervisor
order level. That can only happen *iff* we can get load_idt() issue
addressed. As it stands we can't use the subarch on x86_64_start_kernel()
prior to load_idt(), our userspace demo here uses it as that's the
goal but in practice this doesn't work yet on x86 kernels. We can
only access the subarch after load_idt(), so upstream v3 submission
will use the x86_init_fn_early_init() on x86_64_start_reservations().
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This lets us carry code more as-is from upstream.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
No functional changes yet.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This adds the references to enable module initialization
but only with built-in support.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
We'll use this later for initial module support.
Note this goes with linker table support. It should
give an idea of how we can port proper Linux kernel
init levels to linker table.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Wrap Linux workqueue implementation with pthreads.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
We use pthread_spinlock_t with PTHREAD_PROCESS_SHARED.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
We'll later use this to peg __sched, which will be our
first example subsection to test with linker tables.
Before we do the work to peg mutex to __sched we'll
add a bit more kernel interface support to make things
a bit more interesting and useful to test.
This is a simple mutex implementation which just wraps around
the pthread pthread_mutex_t.
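A wrapper of this kind might look roughly like the sketch below. The
kernel-style names mutex_init(), mutex_lock(), mutex_unlock() and
mutex_trylock() are used on the assumption that the demo mirrors the
kernel interface; the struct layout is illustrative only.

```c
#include <pthread.h>

/* Kernel-looking mutex that simply embeds a pthread mutex. */
struct mutex {
	pthread_mutex_t lock;
};

void mutex_init(struct mutex *m)
{
	pthread_mutex_init(&m->lock, NULL);
}

void mutex_lock(struct mutex *m)
{
	pthread_mutex_lock(&m->lock);
}

void mutex_unlock(struct mutex *m)
{
	pthread_mutex_unlock(&m->lock);
}

/* Like the kernel helper: 1 if the lock was taken, 0 if it was busy. */
int mutex_trylock(struct mutex *m)
{
	return pthread_mutex_trylock(&m->lock) == 0;
}
```

Older C libraries may need -pthread at link time.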
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Found a nugget around SCHED_TEXT documentation on
include/asm-generic/vmlinux.lds.h upstream.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
In practice max structural alignment breaks iteration.
Using functional alignment for each new item works well
though. We can hash out what is best with the community.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
The alignment values come from include/asm-generic/vmlinux.lds.h
through lessons picked up from STRUCT_ALIGN and ALIGN_FUNCTION.
Note that this doesn't work though! Make a record of this and
make the respective functional changes next. Deciding how we
align can then be punted to the community.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
I forgot to add these, and fix a few compile errors.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This adds a small demo of section ranges matching the style
used in the Linux kernel for kprobes. The only use case we give
and test here is being able to annotate routines as part of
the kprobe section, or not. We test a __kprobes routine, and
then also test another routine without the __kprobes annotation.
To support section ranges properly I noticed we needed to also
extend ranges.h with a proper:
SECTION_RANGE_BEGIN()
SECTION_RANGE_END()
These are used for the DEFINE_SECTION_RANGE(), which is used
to require the user to specify the specific section the custom
section range being defined belongs to.
The next required set of tests would be to test sub-sections...
It's not clear what examples there are for these yet though.
At least I have not seen them in
Results:
Initializing x86 bare metal world
Number of init entries: 8
Initializing kasan ...
Early init for Kasan...
Completed initializing kasan !
Initializing memory ...
Completed initializing memory !
Initializing kprobes ...
== OK: test_kprobe_0001 within range!
== OK: test_kprobe_0002 not in range as expected!
Completed initializing kprobes !
Initializing pci ...
PCI fixup size: 1
Demo: Using LINKTABLE_FOR_EACH
foo_fixup
Demo: Using LINKTABLE_RUN_ALL
foo_fixup
Completed initializing pci !
Initializing beta ...
Completed initializing beta !
Initializing alpha ...
Completed initializing alpha !
Initializing acme ...
Completed initializing acme !
Booting bare metal
Calling start_kernel()...
Running setup_arch for kasan ...
Calling setup_arch work for Kasan...
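The range test exercised by the two kprobe checks above can be sketched
without ranges.h. The sketch assumes a GNU toolchain on an ELF target,
where the linker provides __start_<name> and __stop_<name> symbols for
any section whose name is a valid C identifier. The __demo_kprobes
annotation, the "kprobes_text" section name and in_kprobes_range() are
hypothetical stand-ins for the real macros.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical annotation in the spirit of __kprobes. */
#define __demo_kprobes \
	__attribute__((used, noinline, section("kprobes_text")))

/* Provided by GNU ld because "kprobes_text" is a valid C identifier. */
extern const char __start_kprobes_text[];
extern const char __stop_kprobes_text[];

/* Annotated: lands inside the custom section. */
__demo_kprobes int test_kprobe_0001(void)
{
	return 1;
}

/* Not annotated: stays in regular .text. */
__attribute__((noinline)) int test_kprobe_0002(void)
{
	return 2;
}

/* The one operator that makes sense on a range: is addr inside it? */
bool in_kprobes_range(uintptr_t addr)
{
	return addr >= (uintptr_t)__start_kprobes_text &&
	       addr < (uintptr_t)__stop_kprobes_text;
}
```

Passing (uintptr_t)test_kprobe_0001 should report the address as within
range, while (uintptr_t)test_kprobe_0002 should not.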
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
There are two ways custom ELF sections are used:
a) text "ranges":
Collection of executable routines, this use varies but one
example is the kernel's kprobes. Since kprobes uses two
custom sections, to be specific this references the use
case for .kprobes.text -- these are currently rolled into
the kernel's INIT_DATA in include/asm-generic/vmlinux.lds.h
and used as INIT_DATA_SECTION(16) on x86 under
arch/x86/kernel/vmlinux.lds.S
b) tables:
Custom data structures stitched together
We split off linker tables into 3 sections:
I. include/linux/sections.h
II. include/linux/tables.h
III. include/linux/ranges.h
The sections.h already existed, we just keep stuffing in there
generic helpers which allow us to streamline customizing Linux
sections.
Motivation for this is as per hpa [0]:
-----------------------------------------------------------------------
1. priority ordering doesn't make any sense for ranges.
2. ranges can be hierarchial, that is, range "bar" can be entirely
inside range "foo".
3. ranges aren't typed (although in C, that pretty much means they are
"char" or "unsigned char" as there really isn't any way to define an
"array of void".)
4. the only useful operator on a range is "is address X inside this
range"; this operator is likely *not* useful for a table, since
if you have to ever invoke it you are probably doing something very
wrong.
For this to work, we need strings such that they will always sort in the
appropriate order with the bracket symbols around subranges.
-----------------------------------------------------------------------
This is an initial attempt to provide support for this. We do this
by first acknowledging that both section ranges and linker tables
need common building blocks, and then granting the option to declare
either a section range, or letting you customize this further
with a custom data structure. This at least addresses the conceptual
division between section ranges and linker tables.
No section range support test case is provided here, although upstream
this would be used for the .kprobes.text.
We'll next need to try to demo an example with subsections.
[0] http://lkml.kernel.org/r/56CCE9B6.4080000@zytor.com
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This is just in case our demo uses them, we place them
in functional equivalents, given we never free or do
sanity checking, which is one of the ways the refs are
used for.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
The _TEXT postfix is removed since SECTION_INIT also
implies .init.text -- stick to the convention of defaulting
to .text unless otherwise noted.
References for this decision:
http://lkml.kernel.org/r/20160224005458.GK25240@wotan.suse.de
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Be a bit more pedantic about *.text* based sections
and mark them as read-only by using const.
Note that this does not stuff further things into
.rodata, we've already done that, the section mapping
for Linux sections is a separate topic, and as far
as this effort is concerned, its current goal is simply
to use the exact same mappings it already had.
We'll just be standardizing the naming scheme used
to later help identify clearly what proper Linux section
a custom section belongs to, without having to modify
the custom linker script.
If we discover in the process of converting over custom
section code to linker tables that they deserve to be
in a different section we'll do that as a separate step.
If one wants to inspect the kernel's .rodata, one
can look at include/asm-generic/vmlinux.lds.h's
definition of RO_DATA_SECTION(align). Then also
consider these:
#define RODATA RO_DATA_SECTION(4096)
#define RO_DATA(align) RO_DATA_SECTION(align)
At least x86 uses RO_DATA(PAGE_SIZE). For documentation
completeness note that .init.rodata is rolled into
INIT_DATA as per include/asm-generic/vmlinux.lds.h, however
for this userspace demo we're pegging it into .rodata
proper as userspace has no .init that we'll later free
at run time.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This adds the sections.h file to allow easy association of
custom sections to standard ELF / Linux sections. It also
goes ahead and ports over all of the existing old .tbl
linker table code to the new linker table format. In order
to do this, we try to keep the code to match as much as
possible with the RFC v2 submission made [0]. Just for
references, you can review the discussion of the RFC v1
series as well [1].
Other relevant changes:
* It drops documentation in favor of that being done upstream,
this removes the clutter while we just work on the proof
of concept code. If this gets upstream then we'll just sync
the tables.h from upstream to this tree. For now this is just
a bit of a pain in the ass in between submissions.
* Extends pci.c with an example using LINKTABLE_FOR_EACH()
and also LINKTABLE_RUN_ALL().
* Extends include/linux/kernel.h with things we need to make
tables.h compile in userspace -- these are things we don't need.
Expected output:
$./main
Initializing x86 bare metal world
Number of init entries: 7
Initializing kasan ...
Early init for Kasan...
Completed initializing kasan !
Initializing memory ...
Completed initializing memory !
Initializing pci ...
PCI fixup size: 1
Demo: Using LINKTABLE_FOR_EACH
foo_fixup
Demo: Using LINKTABLE_RUN_ALL
foo_fixup
Completed initializing pci !
Initializing beta ...
Completed initializing beta !
Initializing alpha ...
Completed initializing alpha !
Initializing acme ...
Completed initializing acme !
Booting bare metal
Calling start_kernel()...
Running setup_arch for kasan ...
Calling setup_arch work for Kasan...
$ ./main -x
Initializing Xen guest
Number of init entries: 7
Initializing memory ...
Completed initializing memory !
Initializing pci ...
PCI fixup size: 1
Demo: Using LINKTABLE_FOR_EACH
foo_fixup
Demo: Using LINKTABLE_RUN_ALL
foo_fixup
Completed initializing pci !
Initializing xen_driver ...
Completed initializing xen_driver !
Booting a Xen guest
Calling start_kernel()..
[0] http://lkml.kernel.org/r/1455889559-9428-1-git-send-email-mcgrof@kernel.org
[1] http://1450217797-19295-1-git-send-email-mcgrof@do-not-panic.com
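The idea behind the LINKTABLE_FOR_EACH() and LINKTABLE_RUN_ALL() demo in
pci.c can be approximated without tables.h. This is a rough sketch and
not the tables.h implementation: DEMO_TABLE_ENTRY(), the "pci_fixups"
section name and the two helper functions are hypothetical, and it
assumes a GNU toolchain on an ELF target so that the linker supplies the
__start_/__stop_ symbols. Entries are pointer sized, which keeps
iteration safe from padding between items.

```c
#include <stddef.h>
#include <stdio.h>

typedef void (*fixup_fn)(void);

/*
 * Hypothetical helper, not the real DECLARE_LINKTABLE(): place one
 * pointer-sized entry into the ELF section "pci_fixups".
 */
#define DEMO_TABLE_ENTRY(fn) \
	static const fixup_fn demo_entry_##fn \
	__attribute__((used, section("pci_fixups"))) = fn

/* GNU ld defines these for sections named like C identifiers. */
extern const fixup_fn __start_pci_fixups[];
extern const fixup_fn __stop_pci_fixups[];

/* Counts how many fixups have run, only to make the demo observable. */
int fixups_run;

static void foo_fixup(void)
{
	fixups_run++;
	puts("foo_fixup");
}
DEMO_TABLE_ENTRY(foo_fixup);

/* Number of entries the linker gathered into the table. */
size_t pci_fixup_count(void)
{
	return (size_t)(__stop_pci_fixups - __start_pci_fixups);
}

/* Roughly what a "run all" helper does: call every entry in order. */
void pci_run_all_fixups(void)
{
	const fixup_fn *fn;

	for (fn = __start_pci_fixups; fn < __stop_pci_fixups; fn++)
		(*fn)();
}
```

With the single entry registered here, pci_fixup_count() reports one
entry and pci_run_all_fixups() prints foo_fixup once.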
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
We want to grow arch/x86/kernel/vmlinux.lds.S with macros
to allow us to standardize on ELF sections, and custom
Linux sections. This lets us clearly associate custom
ELF sections we add to specific standard ELF / and
standard Linux sections.
The standardization of the sections will be done later. After
this vmlinux.lds.S will #include <linux/sections.h>, but in
order to support this we must ask gcc to preprocess the
vmlinux.lds.S for us.
As is standard in the Linux kernel (in scripts/Makefile.build)
this also defines __ASSEMBLY__ and LINKER_SCRIPT to allow us to
ifdef out code that we know will barf on asm code or, in
our case, the linker script.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Based on feedback here's a v2 style for linker tables with
more Linux coding style in mind. This goes with an example
of how to use it.
For now we don't remove the old linker table solution.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
To replicate what we'd use upstream on Linux.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
To sync this tree to your Linux tree just call this script
from your Linux tree. At least init.c compiles now.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
init.c now compiles as-is on Linux.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This is how it is upstream.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Makes it easier for copying over.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This should match the namespace used upstream.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Again, another thing to make it easier to test / exchange code
with upstream as it evolves.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This will make it easier to place code from this
mockup code upstream, and likewise to later update
this mockup project with upstream code.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
That's closer to what this should look like upstream.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
The tables used within x86's init.h were still prone to
error if other subsystems were to use this. This makes it
very x86 specific. The next step is to pick a file name
we'd actually use upstream.
Stick to only 3 basic levels for now:
X86_INIT_ORDER_EARLY 10
X86_INIT_ORDER_NORMAL 30
X86_INIT_ORDER_LATE 50
I cannot see why we'd need so many levels, simple ordering through
placement on C files and the Makefile should suffice for most needs.
The order level should only really be useful if you have problems
ordering code in C files or Makefiles.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
The custom.lds.S was still working with a generic table,
update that to make it clear how one can work with different
subsystems. This deviates from gPXE's own solution slightly
in that each table needs to be declared separately. That
works better for Linux given we're combining the IOMMU init
solution as well which has its own sorting options with
memmove(). It also works better for Linux given the size
of Linux and how an issue on one subsystem could affect
others. We want to compartmentalize each table solution
separately.
While at it, revamp tables.h with our own Linux documentation.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
The PCI order level was normal, while the driver was early.
The new order level checker picks this up, fix this.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Expanding on our effort to define strong semantics, let's be a bit
more pedantic over the order level, what gpxe calls the table
sub-index.
The ordering part of the section name can't be used in C code yet
to validate a subordinate init routine's order level against the
order level of the routine it depends on. To help with this add
macros to enable setting the order level, also add an order level
as part of the init sequence to enable sanity checking of the order
level at run time.
This also enables semantic parsers to be used to vet for sorting
semantics on order level at build time, reduces overall code,
and also makes it easier to modify the use of annotation as needed
without affecting users.
without affecting users.
With this we get this error:
./main
Initializing x86 bare metal world
INVALID ORDER LEVEL! acme should have an order level <= be called before pci!
Number of init entries: 7
Initializing kasan ...
Early init for Kasan...
Completed initializing kasan !
Initializing memory ...
Completed initializing memory !
Initializing beta ...
Completed initializing beta !
Initializing alpha ...
Completed initializing alpha !
Initializing pci ...
Completed initializing pci !
Initializing acme ...
Completed initializing acme !
Booting bare metal
Calling start_kernel()...
Running setup_arch for kasan ...
Calling setup_arch work for Kasan...
Next we'll fix this.
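One plausible reading of the run time check is sketched below. It is an
assumption, not the checker from this commit: struct demo_init_fn, its
depend pointer, demo_fns and count_order_violations() are hypothetical,
and the three level values are the ones this tree later settles on. An
entry is flagged when its order level is lower than that of the routine
it depends on.

```c
#include <stddef.h>
#include <stdio.h>

enum {
	X86_INIT_ORDER_EARLY  = 10,
	X86_INIT_ORDER_NORMAL = 30,
	X86_INIT_ORDER_LATE   = 50,
};

/* Hypothetical, trimmed-down entry; the real struct carries more. */
struct demo_init_fn {
	const char *name;
	int order_level;
	const struct demo_init_fn *depend; /* must run before us, or NULL */
};

/* acme depends on pci but claims an earlier level: the reported case. */
struct demo_init_fn demo_fns[2] = {
	{ "pci",  X86_INIT_ORDER_NORMAL, NULL },
	{ "acme", X86_INIT_ORDER_EARLY,  &demo_fns[0] },
};

/* Returns how many entries have a lower level than their dependency. */
int count_order_violations(const struct demo_init_fn *fns, size_t n)
{
	int bad = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		const struct demo_init_fn *dep = fns[i].depend;

		if (dep && fns[i].order_level < dep->order_level) {
			fprintf(stderr,
				"INVALID ORDER LEVEL! %s depends on %s\n",
				fns[i].name, dep->name);
			bad++;
		}
	}
	return bad;
}
```

With the table as written the check reports one violation; raising
acme's level to X86_INIT_ORDER_NORMAL clears it.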
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
The x86 init path is critical, if anything fails in it things can
go wrong fast. To avoid issues since we are defining a small framework
for how to run routines, in what order, dependencies, and required
work, we need to define very strong semantics. We start off with
defining semantics through documentation by considering all possible
combinations of uses of the init table framework.
While at it, rename init_fn to x86_init_fn to reflect how this is
specific to x86. This means the sorting routine is specific to x86
as well, as is all other calls. This is intentional, we are customizing
the table work for our own needs on x86.
First a clarification of the x86 boot sequence:
Bare metal, KVM, Xen HVM                      Xen PV / dom0
      startup_64()                             startup_xen()
             \                                     /
     x86_64_start_kernel()                 xen_start_kernel()
                          \               /
                     x86_64_start_reservations()
                                  |
                             start_kernel()
                             [   ...        ]
                             [ setup_arch() ]
                             [   ...        ]
                                 init
A few pedantic semantic changes worth highlighting:
* We are meshing the gpxe and IOMMU specialized init solutions.
The gpxe solution works with a linker sort, while the IOMMU init
solution has its own C sorter, which uses the "depends" callback
and sorting heuristics to memmove() routines around. This makes
sorting, and the semantics defined, specific to the subsystem
init structures.
Two possible paths we need to pick from this:
a) We can make a generic library sort routine, which takes as input
only the subsystem specific table init first entry and last
entry. Doing this generalizes the sorting algorithm but
imposes a common shared set of semantics or -- imposes more
requirements on our sorting routines to take more input, which
would not only enable understanding semantics but also the size
of the structure as memmove() is used to shuffle things around.
b) We just keep sorting routines specific to each subsystem. In this
x86 mockup we go with this solution.
Since I've opted to go with option b) I modified the sort routines,
the linker script to use subsystem specific table start and end
points.
* Clarified the semantic gap caused by current hypervisors between
x86 Linux entry point and pv_ops and how subarchitecture fills
the gap. Refer to init.h for details.
* Clarified order level use:
- Clarified that when considering two init sequences with the same
order level the next thing which considers order is placement
on the C file and next Makefile. Note that SORT() is still used by the
linker but the sort is specific to the order subsystem specific
structures alone and the function names play no role in sorting. Refer to
custom.lds.S -- *(SORT(.tbl.init_fns.*)) is used. The name of the struct
is not considered for sorting. For instance:
$ readelf -S kasan.o| grep tbl
[ 8] .tbl.init_fns.01 PROGBITS 0000000000000000 000000a0
[ 9] .rela.tbl.init_fn RELA 0000000000000000 00000f00
The sort here is going to be performed on the .tbl.init_fns.*, the
order level attributes.
- The 2 digit order level is used to help sort init routines
by batched levels. Since an init sequence can also depend on
another init sequence the order level of the subordinate
init sequence must be less than or equal to the init sequence it
depends on.
Since the ordering is part of the section name we can't use it on
C code to validate the subordinate order level unless we use
macros to enable setting the order level and, as part of that, add a
new member to the init sequence. Coccinelle cannot use
C annotations, it's just not part of the abstract syntax tree in
Coccinelle, as such we can't easily get access to them for validation,
however Coccinelle does understand macro declarations
(ie, DEFINE_MUTEX(), through declarer name DEFINE_MUTEX;) so another
prospect to enforce proper order level could be to annotate order
level with a respective macro declarations and an SmPL rule for
validation.
We can accomplish only similar enforcement checking at run time
by pegging the order level as part of the init structure.
Changes will be made next to define and use macros to enable declarations
of x86 init sequences. This will enable semantic parsers to be used to
vet for sorting semantics on order level, reduces code, and also makes
it easier to modify the use of annotation as needed without affecting
users.
* IOMMU init framework uses an int for the return type for the
detect and depends callback. This seems error-prone, I've
changed this to simply be of return type bool.
* Although using an int return type for early_init() and late_init()
might seem appealing it's not what we use for the Linux x86 init
boot code, all code is critical and cannot fail. As such I've removed
the critical declaration option from the struct x86_init_fn -- to
match Linux. In the future though, this commit could be referenced
for example code if one wanted to add routines which could
optionally fail, and still allow strong semantics for dependencies
between init sequences.
* If an init sequence has X86_SUBARCH_XEN only it must
then mean only that the early_init() callback would be run on
Xen HVM currently. Note that this would mean that the calls are made
right before x86_64_start_reservations(). As it stands typically you
would put this sort of routine within the xen_start_kernel() work
flow, this not only enables us to define xen early init calls but also
ultimately would enable us to fold the two separate x86 entry points as
one if we so desired.
* use u32 for x86_init_fn flags, document what they are.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
A typical issue when introducing new x86 features that require
early init code modifications is not handling the different
x86 supported subarchitectures. Code review typically addresses
this, but regressions are known to have escaped code review, one
example is cr4 shadow setup [0] which on Xen caused crashes,
another more current was kasan [1], this latter one is still known
to be broken for Xen and will crash Xen on bootup when enabled.
A typical knee-jerk reaction to this problem is to simply disable
the feature through Kconfig when certain conflicting subarchitectures
are compiled in, this however restricts the kernel binary. To address
this at run time we can require feature subarchitecture capabilities
annotated. By requiring these annotations we can avoid redundant
code checks that would otherwise have handled incompatibilities in
different code segments for a feature, but more importantly this
also forces clear developer awareness over required subarchitecture
considerations upon development. As with the asm-generic solutions which
enable generic architecture feature code to advance without requiring
all architectures to implement required components, we enable x86
features to advance as different subarchitectures are supported.
A subarch feature detection mechanism also allows us to clean up
the feature depends() callback through the IOMMU init development
solution considerably.
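A minimal sketch of what such a subarchitecture capability annotation
could look like. The X86_SUBARCH_* values are the ones defined by the
x86 boot protocol. The struct, its supp_subarch member and the helper
are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* hardware_subarch values as defined by the x86 boot protocol. */
enum {
	X86_SUBARCH_PC     = 0,
	X86_SUBARCH_LGUEST = 1,
	X86_SUBARCH_XEN    = 2,
};

/* Hypothetical helper: one bit per subarchitecture. */
#define DEMO_SUBARCH_BIT(s)	(UINT32_C(1) << (s))

/* Hypothetical feature annotation. */
struct demo_feature {
	const char *name;
	uint32_t supp_subarch;	/* mask of subarchitectures known to work */
};

/* True only if the feature was annotated as supporting this subarch. */
static bool demo_feature_supported(const struct demo_feature *f,
				   unsigned int subarch)
{
	if (subarch >= 32)
		return false;

	return (f->supp_subarch & DEMO_SUBARCH_BIT(subarch)) != 0;
}
```

A feature annotated with only DEMO_SUBARCH_BIT(X86_SUBARCH_PC) would be
skipped when booting with X86_SUBARCH_XEN, instead of requiring a
separate check in every code path that touches it.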
[0] 5054daa285bea ("x86/xen: Initialize cr4 shadow for 64-bit PV(H) guests")
[1] https://lkml.kernel.org/r/CAB=NE6Xs5fepzNtymzT4CueeJZ0KMPETpda114DpL4eMtDswtw@mail.gmail.com
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
The Linux kernel's IOMMU init table framework requires
a depends() callback to check if your feature is detected
or needed. Since the GPXE init solution works with a linker
script to help you sort the init routines by levels we don't
need to require a depends callback for all init routines,
if they don't have one we skip re-sorting them based on
this heuristic.
This will be more useful as we also add support later for
subarchitecture dependency semantics, which we'll do next.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
These will be used later.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Kasan does not work on Xen right now. It does work on bare metal
and kvm though. The reason for this is it does not have the same
setup work as done on early init for x86, as x86's bare metal early
init. It needs some work.
Additionally, you currently cannot disable kasan at run time and even
if you could we wouldn't know that Xen was the reason why this would
be failing. We have no way to annotate this at run time on bootup
for users or even developers. We'd want to annotate this to create
developer awareness over the requirements of such features which
require mucking with different init entries and to prevent code which
should be dead from running.
The work we'll do next will transform kasan to use the table
init work, we'll also make use of the existing boot protocol subarch work
added through x86 boot protocol 2.07, and add capability support to
the init tables to be able to annotate when a general x86 feature
currently lacks support for a boot run time environment, identified
through the subarch.
Current simulation which shows success on x86 and a replicated
issue when booting xen:
Bare metal simulation:
mcgrof@ergon ~/devel/table-init (git::master)$ ./main
Initializing x86 bare metal world
Number of init entries: 4
Initializing Memory ...
Completed initializing Memory !
Initializing PCI buses ...
Completed initializing PCI buses !
Initializing ACME(TM) Driver ...
Completed initializing ACME(TM) Driver !
Early init for Kasan...
Booting bare metal
Calling start_kernel()...
Calling setup_arch work for Kasan...
Xen simulation:
mcgrof@ergon ~/devel/table-init (git::master)$ ./main -x
Number of init entries: 4
Initializing Memory ...
Completed initializing Memory !
Initializing PCI buses ...
Completed initializing PCI buses !
Initializing ACME(TM) Driver ...
Completed initializing ACME(TM) Driver !
Initializing Xen Driver ...
Completed initializing Xen Driver !
Initializing Xen guest
Booting a Xen guest
Calling start_kernel()...
Kasan was not set up...
----------------------------------------------------------
BUG on kasan_init at kasan.c: 24
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Now that we're done with the demo over how an init section
depends on another let's "fix" the PCI code to let us move
on with what we want to further demo out of this init work.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Without this even though ACME depends on PCI and
PCI is currently set up to fail, the ACME init routine
continues to be called:
Initializing x86 bare metal world
Number of init entries: 4
Initializing Memory ...
Completed initializing Memory !
Initializing PCI buses ...
Failed to initialize PCI buses on early init, but its not critical
Initializing ACME(TM) Driver ...
Completed initializing ACME(TM) Driver !
Early init for Kasan...
Booting bare metal
Calling start_kernel()...
Calling setup_arch work for Kasan...
By setting PCI to be critical we won't traverse on
to ACME's init as it depends on PCI. Technically in
the kernel we don't have things like this affect
boot init, this is just an example / demo of how an
init sequence can depend on another, and how to do
this.
Initializing x86 bare metal world
Number of init entries: 4
Initializing Memory ...
Completed initializing Memory !
Initializing PCI buses ...
Failed to initialize PCI buses on early init
Early init failed
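A minimal sketch of the behaviour shown in the two runs above: a failing
entry that is marked critical stops the whole early init, while a
non-critical failure is only reported. All names here are hypothetical
and the demo callbacks exist only to exercise the loop.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical init entry with a critical annotation. */
struct demo_crit_fn {
	const char *name;
	int (*early_init)(void);	/* 0 on success, non-zero on failure */
	bool critical;
};

/* Returns 0 if init may move on, -1 if a critical entry failed. */
static int demo_run_early_init(const struct demo_crit_fn *fns, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		printf("Initializing %s ...\n", fns[i].name);
		if (fns[i].early_init() == 0) {
			printf("Completed initializing %s !\n", fns[i].name);
			continue;
		}
		if (fns[i].critical) {
			printf("Failed to initialize %s on early init\n",
			       fns[i].name);
			return -1;	/* later entries are never called */
		}
		printf("Failed to initialize %s, but it is not critical\n",
		       fns[i].name);
	}

	return 0;
}

/* Demo callbacks: PCI is set up to fail, ACME records that it ran. */
static int demo_acme_ran;
static int demo_mem_init(void)  { return 0; }
static int demo_pci_init(void)  { return -1; }
static int demo_acme_init(void) { demo_acme_ran = 1; return 0; }
```

Note this stops everything after the critical failure, not only the
entries that depend on PCI.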
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Print the kernel version string.
Screenshot:
mcgrof@ergon ~/devel/table-init (git::master)$ ./parse-bzimage
~mcgrof/linux/arch/x86/boot/bzImage
kernel: /home/mcgrof/linux/arch/x86/boot/bzImage
kernel size: 5668224 bytes
Going to parse kernel...
-------------------------------------------------
Kernel version: 4.4.0-rc1-1-default+ (mcgrof@ergon.do-not-panic.com) #215 SMP PREEMPT Tue Nov 17 15:48:54 PST 2015
-------------------------------------------------
Xen Expects: 0x53726448
Qemu Expects: 0x53726448
On image: 0x53726448
bzImage protocol Version: v2.13
Xen hdr->version: 525
Qemu protocol: 525
Qemu VERSION(2,8): 520
-------------------------------------------------
Boot protocol 2.07: 0x0207 (supports hardware_subarch)
Boot protocol 2.08: 0x0208
Boot protocol 2.09: 0x0209
Boot protocol 2.10: 0x020a
Boot protocol 2.11: 0x020b
Boot protocol 2.12: 0x020c
Boot protocol 2.13: 0x020d
Member Offset Expected Match
-------------------------------------------------------------------------
setup_header->loadflags 0x0211 0x0211 YES
setup_header->hardware_subarch 0x023c
setup_header->hardware_subarch_data 0x0240
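The decimal values in the output above (525 and 520) are the boot
protocol version with the major number in the high byte and the minor
number in the low byte. A small sketch of that encoding follows. The
macro and helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: major in the high byte, minor in the low byte. */
#define DEMO_BOOT_VERSION(major, minor) \
	((uint16_t)(((major) << 8) | (minor)))

static unsigned int demo_boot_major(uint16_t version)
{
	return version >> 8;
}

static unsigned int demo_boot_minor(uint16_t version)
{
	return version & 0xff;
}

/* hardware_subarch is only present from boot protocol 2.07 onwards. */
static bool demo_has_hardware_subarch(uint16_t version)
{
	return version >= DEMO_BOOT_VERSION(2, 7);
}
```

So 0x020d is 525 in decimal and decodes to v2.13, while
DEMO_BOOT_VERSION(2, 8) is 0x0208, or 520.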
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Remove the -c, that was a typo.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Prior to booting Linux, qemu / xen / grub2 need to parse
the bzImage, they then also re-use some of the same to set up
the 0 page (on 32-bit, on 64-bit this can be anywhere) for
Xen HVM and KVM.
This code is added to aid evaluation of making changes on
qemu / xen / grub / native kvm tool, etc. I at least have
patches ready now for:
* qemu - supports kvm and Xen HVM
* native kvm tool
I'm still evaluating Xen's requirements for PV and PVH, this
is a bit complex due to the requirements of having a custom
boot loader use when one doesn't want to specify the kernel
and initrd manually. For that see:
http://wiki.xenproject.org/wiki/PvGrub2
http://xenbits.xen.org/docs/unstable/misc/x86-xenpv-bootloader.html
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This starts to make use of the hardware_subarch from the
boot_params, available as of the 2.07 x86 boot protocol
to illustrate how it can be used on early init.
X86_FEATURE_HYPERVISOR cannot be used before setup_arch()
as it's only available after early_cpu_init(), we want
something that will be available as early as the kernel
boots. X86_FEATURE_HYPERVISOR is also very generic.
The hardware_subarch has been used for other x86 subarchs
and at least for 32-bit there is already even a switch for
it on i386_start_kernel().
We keep this code in place for now as we are evaluating
the complexities of setting this value for the different
boot loaders: xen / qemu / grub2, etc. We also want something
that will actually be useful for not only xen but also kvm,
so further evaluation of possible dead code for kvm is also
being done.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Kasan has 2 tier calls, an early init call followed by
a setup_arch() specific call. Our work will be to find
a solution that fits using the tables.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This now mimics the 3 different C level init
entry points that a kernel feature might use.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This lets us split up the init stuff out
replicating the kernel's own init sequence.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
We'll expand on this a bit to make it similar to the
kernel's situation. We'll ignore architecture stuff
though and put everything into main that would otherwise
be arch specific.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Mimic Linux code for xen init. This is the entry point used
from handoff by the Xen Hypervisor. The way the hypervisor
finds the entry point is by inspecting the Linux binary,
looking for the XEN_ELFNOTE_ENTRY elf note and using that.
For bare metal and lguest the entry points are different, and
we'll mimic that next.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
SORT_BY_INIT_PRIORITY() was added with the goal of meshing
.ctor .dtor into .init_array and .fini_array respectively. This
was done to avoid the need for the associated (relative) relocations,
and avoids backwards disk seek on startup (since while .ctors are
processed backwards, .init_array is processed forward) [0] [1].
This latter aspect of this change is considered a performance gain
for C++ applications such as Mozilla or Google Chrome. There are
a series of compatibility requirements in place to help sort out old
expected constructor section priority annotations, this is all
now moved onto .init_array. The SORT_BY_INIT_PRIORITY() is very
specific to .init_array sections though, so you'll need to use
__attribute__ ((section (".init_array.65530"))) for instance.
We want the freedom to use custom sections, but in principle
also sort numerically. We don't have support for that yet and even
if we did this would require binutils >= 2.22. As per the kernel's
documentation [2] the current minimum requirements for binutils
is 2.12.
The only reason why SORT_BY_INIT_PRIORITY() *works* is the
implementation resorts back to SORT_BY_NAME() in the case that
the detected priority for both sections it is comparing is 0,
which will be the case for any section that uses
SORT_BY_INIT_PRIORITY() if it is not prefixed by ".init_array.",
".fini_array.", ".ctors.", or ".dtors.". For more details refer
to the implementation of get_init_priority() in binutils
ld/ldlang.c.
The old SORT() could still be used with its own numeric section by
just using your own macros to specify the priority numerically, even
though it will sort in ascending order by name. I'll note that
SORT() is just an alias for SORT_BY_NAME(), this is really the
implementation we'd be relying on.
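A small sketch of why a fixed-width numeric suffix works with a
name-based sort. It models SORT() / SORT_BY_NAME() as a plain byte-wise
string comparison of section names, which is an assumption made for this
example and not a statement about the exact binutils implementation. The
function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() comparator over an array of C strings. */
static int demo_cmp_section_name(const void *a, const void *b)
{
	const char *const *sa = a;
	const char *const *sb = b;

	return strcmp(*sa, *sb);
}

/* Sort section names in ascending order by name. */
static void demo_sort_sections(const char **names, size_t n)
{
	qsort(names, n, sizeof(*names), demo_cmp_section_name);
}
```

With two digit suffixes, .tbl.init_fns.00, .01, .04 and .99 come out in
numeric order. Without zero padding the scheme breaks: .tbl.init_fns.10
sorts before .tbl.init_fns.4 because '1' compares lower than '4'. This
is why the priority macros need to emit a fixed number of digits.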
As far as SORT() / SORT_BY_NAME() is concerned -- the only concern
I can come up with after reading the implementation details of
SORT_BY_INIT_PRIORITY(), even though we won't use it, is that
there are no precise semantics guaranteeing sort order across
different object files: *technically* the support is there and
this works but as far as I can tell these guarantees are not
clearly documented formally as part of binutils.
Note that although SORT_BY_INIT_PRIORITY() was designed
specifically for init_array stuff, I noted that it still
resorts to SORT_BY_NAME() as a backup for non init_array
section names, and this is what we want. So if we really
wanted to annotate that we want SORT_BY_INIT_PRIORITY() style
behaviour, to enforce a numeric priority identity
in future binutils releases, we *could* in theory just
use SORT_BY_INIT_PRIORITY() to annotate both this and the
desire to have it in future binutils. *But* not only would
this likely require a separate macro that is not specific
to init, perhaps SORT_BY_PRIORITY(), it would also mean bumping the
binutils minimum requirements from 2.12 to 2.22 or later
if we want a generic new SORT_BY_PRIORITY().
So -- for now just use SORT_BY_NAME().
[0] https://gcc.gnu.org/ml/gcc/2010-12/msg00344.html
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46770
[2] Documentation/Changes as of linux-next next-20151012
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
We can later repurpose a xen init structure call for white-listing
bare metal specific code.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
To simulate booting xen just pass any parameter to main, for instance:
./main -x
Right now this fails as there is no kasan set up developed yet for Xen.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
pretend this is kasan stuff.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Forgot to add this.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Forgot to add this to the tree. Note that if
we make PCI critical, since its failing right
now we end up with:
Completed initializing Foo thing !
Initializing PCI buses ...
Early init failed
So all of init fails. If we remove the critical
annotation we get:
Initializing PCI buses ...
Failed to initialize PCI buses on early init, but its not critical
Initializing ACME(TM) Driver ...
Completed initializing ACME(TM) Driver !
We want to add support for ACME driver to be able to
annotate it depends on PCI subsystem for this simple
example case but to let the init sequence go on.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
What does not work is if pci fails we should not move on with
PCI drivers, we can try to fix that next. What this should mean is
that if an init sequence has a "depends" and if its marked as
critical then it cannot allow things that depend on it to
continue. Right now if an init sequence is listed as critical
we'll fail the entire init sequence right then where it failed,
we want to enable a middle ground, in such a way as to express
that one should not let things that depend move forward unless
it continued successfully.
Note that in this case the driver had a higher init priority
though originally:
driver: INIT_EARLY
pci: INIT_NORMAL
#define INIT_EARLY 01
#define INIT_SERIAL 02
#define INIT_CONSOLE 03
#define INIT_NORMAL 04
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Well it seems the sort dependency thing actually works.
I can see this if I move the driver to INIT_EARLY and
keep the pci driver on INIT_NORMAL.
If this is done then the driver would run first. This is
obviously wrong, so to fix that we could add an annotation
of having the driver depend on detect_pci, we leave that
commented out first to let you test and see that for
yourself.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
The driver now depends on the PCI bus, this is annotated,
the tables match the same init level, but the dependency
thing does not seem to work...
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
I haven't tested this yet but this pretty much imports
the kernel's arch/x86/kernel/pci-iommu_table.c solution
for sorting and checking the iommu tables, I've ported
it to this generic solution.
I'll test it next, but it compiles and doesn't break
things as-is. So *other* than the initial linker scripted
sorting, which we get automatically from the generic solution
now, we also support run time dependency annotations which
we later now use at run time to sort out. If some init code
does not need to make any dependency annotations the default
sort should suffice. If they do they can annotate things and
the sort should shift things around for you.
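A simplified sketch of this kind of run time dependency sort, written in
the spirit of the kernel's IOMMU table sort but not copied from this
tree. An entry names its dependency by pointing at that entry's detect
routine, and an entry is swapped forward whenever its dependency sits
later in the table. All names are hypothetical, and there is no cycle
detection, so circular dependencies would loop forever.

```c
#include <stdbool.h>
#include <stddef.h>

typedef bool (*demo_detect_t)(void);

/* Hypothetical table entry. */
struct demo_dep_fn {
	const char *name;
	demo_detect_t detect;	/* identifies this entry */
	demo_detect_t depend;	/* detect routine of the entry needed first */
};

/* Find the entry that q depends on, or NULL if it has no dependency. */
static struct demo_dep_fn *demo_find_dependency(struct demo_dep_fn *start,
						struct demo_dep_fn *finish,
						const struct demo_dep_fn *q)
{
	struct demo_dep_fn *p;

	if (!q->depend)
		return NULL;

	for (p = start; p < finish; p++)
		if (p->detect == q->depend)
			return p;

	return NULL;
}

/* Move dependencies in front of the entries that need them. */
static void demo_sort_by_depend(struct demo_dep_fn *start,
				struct demo_dep_fn *finish)
{
	struct demo_dep_fn *p, *q, tmp;

	for (p = start; p < finish; p++) {
		/* Re-check slot p: the entry swapped in may depend on more. */
		while ((q = demo_find_dependency(start, finish, p)) != NULL &&
		       q > p) {
			tmp = *p;
			*p = *q;
			*q = tmp;
		}
	}
}

/* Demo detect routines, only used to give each entry an identity. */
static bool demo_mem_present = true;
static bool demo_pci_present = true;
static bool demo_acme_present = true;
static bool demo_detect_mem(void)  { return demo_mem_present; }
static bool demo_detect_pci(void)  { return demo_pci_present; }
static bool demo_detect_acme(void) { return demo_acme_present; }
```

Entries without a depend callback are left where the linker sort put
them, so the default order is enough when nothing is annotated.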
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This mimics the kernel's iommu init stuff.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This demoes that now that this returns 0 the init
sequence moves on as expected. No other init sequence
has any critical section defined.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Add a critical boolean entry for each init struct. This lets
init sequences annotate whether or not they are required to
succeed completion before letting init move on with life.
This deviates from the kernel's iommu init sequence stuff, it
uses void for the init sequence routines. We have to modify our
init routine callback to return an int, while we're doing that
just rename the callback to early_init as our next change will
be to add support for a late init.
For now we note that X is the only thing pegged as being critical.
If we add this to the kernel the iommu stuff won't need critical
bools, if they don't have it we'll just continue on as before. This
will be more useful for other areas of the kernel.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Now that we know detect works, sprinkle
detects all around.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This lets a component have a detection routine.
This mimics the struct iommu_table_entry from the
kernel. For now we just add a detect routine to
the x thing.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This lets us do all the printing.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Now that we've demoed the binary let's stick to something
that would look more like we would use upstream on Linux.
We'd want its own section, perhaps right after .iommu_table
or eventually extending / enhancing it by making it more
generic.
For now we keep SORT_BY_INIT_PRIORITY() but annotate that
we may just want SORT() for now. While at it, add an ALIGN(8).
Now the fun begins -- extending our init struct to match
more of our needs. This will be to match requirements of the
kernels' iommu_table hacks (dependencies), but we're also
shooting to make it very generic.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
SORT() sorts lexicographically, we want to ensure we
sort by a priority number, which is exactly what
SORT_BY_INIT_PRIORITY() seems to be used for,
so use that instead -- note this still requires
some vetting from source code on binutils. It's also
still not clear since when SORT_BY_INIT_PRIORITY() is
available though. If SORT() is older, what does that
buy us in terms of binutils requirements ? SORT() is
already used in the Linux kernel [0], so if we want a solution
that works there perhaps work off of just SORT().
For now we'll work with SORT_BY_INIT_PRIORITY() though
to at least annotate we might want a change. Once homework
is done to clearly distinguish the gains over SORT() we
can decide.
[0] See arch/x86/kernel/vmlinux.lds
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
By using .text and using SORT() we now match the
gpxe solution. To see the .tbl you can use:
objdump -x -j .text main
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Sorting the .tbl section with SORT() now fixes this program.
This still deviates from gpxe's solution, gpxe moves
all .tbl sections into .text. We'll do that next. This
change was done to show we can stuff this stuff into
any section we want.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
The default linker script will create a section for each
declared .tbl init routine we add, we can't sort these
this way. We also can't see the 00 and 99 tbls, the linker
probably optimizes this out as it's all 0'd out anyway but
we need to be sure about the values of those. By adding a
dedicated section .tbl for all .tbl* stuff we can dump it
out with objdump -x -j .tbl main
...
CONTENTS, ALLOC, LOAD, DATA
SYMBOL TABLE:
0000000000601050 l d .tbl 0000000000000000 .tbl
0000000000601060 l O .tbl 0000000000000000 __table_entries.2938
0000000000601060 l O .tbl 0000000000000000 __table_entries.2940
0000000000601060 l O .tbl 0000000000000000 __table_entries.2942
0000000000601060 l O .tbl 0000000000000000 __table_entries.2944
0000000000601058 g O .tbl 0000000000000008 bar_init_fn
0000000000601060 g O .tbl 0000000000000008 x_init_fn
0000000000601050 g O .tbl 0000000000000008 foo_init_fn
0000000000601070 g .tbl 0000000000000000 __bss_start
0000000000601068 g O .tbl 0000000000000008 y_init_fn
0000000000601050 g O .tbl 0000000000000000 .hidden __TMC_END__
...
It's not so obvious that the __table_entries.29* are the
00 and 99 .tbl entry references though.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
The base linker script was generated using:
CFLAGS +=-Wl,--verbose
Nothing additional was extended to the original default
linker script.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This doesn't yet work. The output when run:
$ ./main
Initializing world
Number of init entries: 0
The *reason* this does not work is the table sections are
not linked correctly and there is a slew of issues that would
need to be fixed to get that right:
0) We want to somehow keep the pointers around for the sections referring
to statically created variables. Right now they are only visible on the
init.o object (see readelf -S init.o below), once the linker
links these into main.o they are lost. We could try to make
them explicit on main if we compile with "-Wl,--emit-relocs",
but that doesn't solve all of our issues.
1) We want to sort the sections. As can be seen below they
are not sorted properly so the 99 section is actually before 00
right now.
We'll fix this atomically next to make things clear exactly how we
are addressing these shortcomings.
$ readelf -S main
There are 41 section headers, starting at offset 0x38b8:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .interp PROGBITS 0000000000400238 00000238
000000000000001c 0000000000000000 A 0 0 1
[ 2] .note.ABI-tag NOTE 0000000000400254 00000254
0000000000000020 0000000000000000 A 0 0 4
[ 3] .note.gnu.build-i NOTE 0000000000400274 00000274
0000000000000024 0000000000000000 A 0 0 4
[ 4] .hash HASH 0000000000400298 00000298
0000000000000028 0000000000000004 A 6 0 8
[ 5] .gnu.hash GNU_HASH 00000000004002c0 000002c0
000000000000001c 0000000000000000 A 6 0 8
[ 6] .dynsym DYNSYM 00000000004002e0 000002e0
0000000000000078 0000000000000018 A 7 1 8
[ 7] .dynstr STRTAB 0000000000400358 00000358
0000000000000045 0000000000000000 A 0 0 1
[ 8] .gnu.version VERSYM 000000000040039e 0000039e
000000000000000a 0000000000000002 A 6 0 2
[ 9] .gnu.version_r VERNEED 00000000004003a8 000003a8
0000000000000020 0000000000000000 A 7 1 8
[10] .rela.dyn RELA 00000000004003c8 000003c8
0000000000000018 0000000000000018 A 6 0 8
[11] .rela.plt RELA 00000000004003e0 000003e0
0000000000000060 0000000000000018 AI 6 13 8
[12] .init PROGBITS 0000000000400440 00000440
000000000000001a 0000000000000000 AX 0 0 4
[13] .plt PROGBITS 0000000000400460 00000460
0000000000000050 0000000000000010 AX 0 0 16
[14] .text PROGBITS 00000000004004b0 000004b0
0000000000000262 0000000000000000 AX 0 0 16
[15] .fini PROGBITS 0000000000400714 00000714
0000000000000009 0000000000000000 AX 0 0 4
[16] .rodata PROGBITS 0000000000400720 00000720
00000000000000b1 0000000000000000 A 0 0 4
[17] .eh_frame_hdr PROGBITS 00000000004007d4 000007d4
000000000000005c 0000000000000000 A 0 0 4
[18] .eh_frame PROGBITS 0000000000400830 00000830
00000000000001cc 0000000000000000 A 0 0 8
[19] .init_array INIT_ARRAY 0000000000600e00 00000e00
0000000000000008 0000000000000000 WA 0 0 8
[20] .fini_array FINI_ARRAY 0000000000600e08 00000e08
0000000000000008 0000000000000000 WA 0 0 8
[21] .jcr PROGBITS 0000000000600e10 00000e10
0000000000000008 0000000000000000 WA 0 0 8
[22] .dynamic DYNAMIC 0000000000600e18 00000e18
00000000000001e0 0000000000000010 WA 7 0 8
[23] .got PROGBITS 0000000000600ff8 00000ff8
0000000000000008 0000000000000008 WA 0 0 8
[24] .got.plt PROGBITS 0000000000601000 00001000
0000000000000038 0000000000000008 WA 0 0 8
[25] .data PROGBITS 0000000000601038 00001038
0000000000000010 0000000000000000 WA 0 0 8
[26] .tbl.init_fns.01 PROGBITS 0000000000601048 00001048
0000000000000018 0000000000000000 WA 0 0 8
[27] .tbl.init_fns.04 PROGBITS 0000000000601060 00001060
0000000000000008 0000000000000000 WA 0 0 8
[28] .bss NOBITS 0000000000601068 00001068
0000000000000008 0000000000000000 WA 0 0 1
[29] .comment PROGBITS 0000000000000000 00001068
0000000000000040 0000000000000001 MS 0 0 1
[30] .debug_aranges PROGBITS 0000000000000000 000010b0
00000000000001d0 0000000000000000 0 0 16
[31] .debug_info PROGBITS 0000000000000000 00001280
00000000000008ad 0000000000000000 0 0 1
[32] .debug_abbrev PROGBITS 0000000000000000 00001b2d
00000000000004f4 0000000000000000 0 0 1
[33] .debug_line PROGBITS 0000000000000000 00002021
00000000000003c1 0000000000000000 0 0 1
[34] .debug_frame PROGBITS 0000000000000000 000023e8
0000000000000170 0000000000000000 0 0 8
[35] .debug_str PROGBITS 0000000000000000 00002558
0000000000000431 0000000000000001 MS 0 0 1
[36] .debug_loc PROGBITS 0000000000000000 00002989
000000000000013e 0000000000000000 0 0 1
[37] .debug_ranges PROGBITS 0000000000000000 00002ad0
0000000000000110 0000000000000000 0 0 16
[38] .shstrtab STRTAB 0000000000000000 00002be0
0000000000000190 0000000000000000 0 0 1
[39] .symtab SYMTAB 0000000000000000 00002d70
0000000000000888 0000000000000018 40 65 8
[40] .strtab STRTAB 0000000000000000 000035f8
00000000000002ba 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
But note the 00 and 99 tables for init on init.o:
$ readelf -S init.o
There are 29 section headers, starting at offset 0x11f8:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .text PROGBITS 0000000000000000 00000040
000000000000006f 0000000000000000 AX 0 0 1
[ 2] .rela.text RELA 0000000000000000 00000c28
0000000000000108 0000000000000018 I 27 1 8
[ 3] .data PROGBITS 0000000000000000 000000af
0000000000000000 0000000000000000 WA 0 0 1
[ 4] .bss NOBITS 0000000000000000 000000af
0000000000000000 0000000000000000 WA 0 0 1
[ 5] .rodata.str1.1 PROGBITS 0000000000000000 000000af
000000000000003d 0000000000000001 AMS 0 0 1
[ 6] .text.unlikely PROGBITS 0000000000000000 000000ec
0000000000000000 0000000000000000 AX 0 0 1
[ 7] .tbl.init_fns.99 PROGBITS 0000000000000000 000000f0
0000000000000000 0000000000000000 WA 0 0 8
[ 8] .tbl.init_fns.00 PROGBITS 0000000000000000 000000f0
0000000000000000 0000000000000000 WA 0 0 8
[ 9] .tbl.init_fns.01 PROGBITS 0000000000000000 000000f0
0000000000000008 0000000000000000 WA 0 0 8
[10] .rela.tbl.init_fn RELA 0000000000000000 00000d30
0000000000000018 0000000000000018 I 27 9 8
[11] .debug_frame PROGBITS 0000000000000000 000000f8
0000000000000088 0000000000000000 0 0 8
[12] .rela.debug_frame RELA 0000000000000000 00000d48
0000000000000060 0000000000000018 I 27 11 8
[13] .eh_frame PROGBITS 0000000000000000 00000180
0000000000000078 0000000000000000 A 0 0 8
[14] .rela.eh_frame RELA 0000000000000000 00000da8
0000000000000030 0000000000000018 I 27 13 8
[15] .debug_info PROGBITS 0000000000000000 000001f8
00000000000001bd 0000000000000000 0 0 1
[16] .rela.debug_info RELA 0000000000000000 00000dd8
00000000000003d8 0000000000000018 I 27 15 8
[17] .debug_abbrev PROGBITS 0000000000000000 000003b5
00000000000000fe 0000000000000000 0 0 1
[18] .debug_loc PROGBITS 0000000000000000 000004b3
0000000000000023 0000000000000000 0 0 1
[19] .debug_aranges PROGBITS 0000000000000000 000004d6
0000000000000030 0000000000000000 0 0 1
[20] .rela.debug_arang RELA 0000000000000000 000011b0
0000000000000030 0000000000000018 I 27 19 8
[21] .debug_line PROGBITS 0000000000000000 00000506
0000000000000086 0000000000000000 0 0 1
[22] .rela.debug_line RELA 0000000000000000 000011e0
0000000000000018 0000000000000018 I 27 21 8
[23] .debug_str PROGBITS 0000000000000000 0000058c
0000000000000180 0000000000000001 MS 0 0 1
[24] .comment PROGBITS 0000000000000000 0000070c
0000000000000041 0000000000000001 MS 0 0 1
[25] .note.GNU-stack PROGBITS 0000000000000000 0000074d
0000000000000000 0000000000000000 0 0 1
[26] .shstrtab STRTAB 0000000000000000 0000074d
000000000000011b 0000000000000000 0 0 1
[27] .symtab SYMTAB 0000000000000000 00000868
0000000000000318 0000000000000018 28 28 8
[28] .strtab STRTAB 0000000000000000 00000b80
00000000000000a4 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
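As a rough illustration of why the 00 and 99 sections show a size of 0 in the init.o output above while the 01 section shows 8 bytes, here is a minimal sketch. It assumes GCC's zero-length array extension and section attribute, in the style of the imported tables.h. The names init_fns_start, init_fns_end, demo_init, demo_entry and the tbl_*() helpers are hypothetical and not taken from the tree:

```c
#include <assert.h>
#include <stddef.h>

typedef int (*init_fn_t)(void);

/*
 * Start and end markers: zero-length arrays (a GNU C extension).
 * They reserve no storage, so their sections have size 0, but the
 * sections still exist and give the table a first and last address
 * once something orders the sections by name.
 */
static init_fn_t init_fns_start[0]
	__attribute__((section(".tbl.init_fns.00"), used));
static init_fn_t init_fns_end[0]
	__attribute__((section(".tbl.init_fns.99"), used));

/* Hypothetical init routine, for illustration only. */
static int demo_init(void)
{
	return 0;
}

/* One real entry at level 01: a single function pointer. */
static init_fn_t demo_entry
	__attribute__((section(".tbl.init_fns.01"), used)) = demo_init;

/* Bytes reserved by the two markers together. */
size_t tbl_marker_bytes(void)
{
	return sizeof(init_fns_start) + sizeof(init_fns_end);
}

/* Bytes reserved by the single level 01 entry. */
size_t tbl_entry_bytes(void)
{
	return sizeof(demo_entry);
}

/* Call the entry through the pointer stored in the table section. */
int tbl_run_entry(void)
{
	assert(demo_entry != NULL);
	return demo_entry();
}
```

This only shows how the section sizes come about. It does not walk the table from start to end, since that depends on the linker placing the sections in sorted order.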
Note: if we compile with "-Wl,--emit-relocs" we get:
$ readelf -S main | grep tbl
[32] .tbl.init_fns.01 PROGBITS 0000000000601050 00001050
[33] .rela.tbl.init_fn RELA 0000000000000000 00004bc0
[34] .tbl.init_fns.04 PROGBITS 0000000000601068 00001068
[35] .rela.tbl.init_fn RELA 0000000000000000 00004c08
[36] .tbl.init_fns.99 PROGBITS 0000000000601070 00001070
[37] .tbl.init_fns.00 PROGBITS 0000000000601070 00001070
That still doesn't quite do it for us.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Just remove FILE_LICENCE() usage.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
Imported tables.h from the gpxe project [0] as of
commit c23508d796325f8af724ad497c99ffb5a49abea0.
This code is licensed under the GPLv2.
[0] git://git.etherboot.org/scm/gpxe.git
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
|