.. SPDX-License-Identifier: GPL-2.0

==============================================================================
Concurrent Modification and Execution of Instructions (CMODX) for RISC-V Linux
==============================================================================

CMODX is a programming technique where a program executes instructions that were
modified by the program itself. Instruction storage and the instruction cache
(icache) are not guaranteed to be synchronized on RISC-V hardware. Therefore, the
program must enforce its own synchronization with the unprivileged fence.i
instruction.

CMODX in the Kernel Space
-------------------------

Dynamic ftrace
---------------------

Essentially, dynamic ftrace directs the control flow by inserting a function
call at each patchable function entry, and patches it dynamically at runtime to
enable or disable the redirection. In the case of RISC-V, 2 instructions,
AUIPC + JALR, are required to compose a function call. However, it is impossible
to patch 2 instructions and expect that a concurrent read-side executes them
without a race condition. This series makes atmoic code patching possible in
RISC-V ftrace. Kernel preemption makes things even worse as it allows the old
state to persist across the patching process with stop_machine().

In order to get rid of stop_machine() and run dynamic ftrace with full kernel
preemption, we partially initialize each patchable function entry at boot-time,
setting the first instruction to AUIPC, and the second to NOP. Now, atmoic
patching is possible because the kernel only has to update one instruction.
According to Ziccif, as long as an instruction is naturally aligned, the ISA
guarantee an  atomic update.

By fixing down the first instruction, AUIPC, the range of the ftrace trampoline
is limited to +-2K from the predetermined target, ftrace_caller, due to the lack
of immediate encoding space in RISC-V. To address the issue, we introduce
CALL_OPS, where an 8B naturally align metadata is added in front of each
pacthable function. The metadata is resolved at the first trampoline, then the
execution can be derect to another custom trampoline.

CMODX in the User Space
-----------------------

Though fence.i is an unprivileged instruction, the default Linux ABI prohibits
the use of fence.i in userspace applications. At any point the scheduler may
migrate a task onto a new hart. If migration occurs after the userspace
synchronized the icache and instruction storage with fence.i, the icache on the
new hart will no longer be clean. This is due to the behavior of fence.i only
affecting the hart that it is called on. Thus, the hart that the task has been
migrated to may not have synchronized instruction storage and icache.

There are two ways to solve this problem: use the riscv_flush_icache() syscall,
or use the ``PR_RISCV_SET_ICACHE_FLUSH_CTX`` prctl() and emit fence.i in
userspace. The syscall performs a one-off icache flushing operation. The prctl
changes the Linux ABI to allow userspace to emit icache flushing operations.

As an aside, "deferred" icache flushes can sometimes be triggered in the kernel.
At the time of writing, this only occurs during the riscv_flush_icache() syscall
and when the kernel uses copy_to_user_page(). These deferred flushes happen only
when the memory map being used by a hart changes. If the prctl() context caused
an icache flush, this deferred icache flush will be skipped as it is redundant.
Therefore, there will be no additional flush when using the riscv_flush_icache()
syscall inside of the prctl() context.

prctl() Interface
---------------------

Call prctl() with ``PR_RISCV_SET_ICACHE_FLUSH_CTX`` as the first argument. The
remaining arguments will be delegated to the riscv_set_icache_flush_ctx
function detailed below.

.. kernel-doc:: arch/riscv/mm/cacheflush.c
	:identifiers: riscv_set_icache_flush_ctx

Example usage:

The following files are meant to be compiled and linked with each other. The
modify_instruction() function replaces an add with 0 with an add with one,
causing the instruction sequence in get_value() to change from returning a zero
to returning a one.

cmodx.c::

	#include <stdio.h>
	#include <sys/prctl.h>

	extern int get_value();
	extern void modify_instruction();

	int main()
	{
		int value = get_value();
		printf("Value before cmodx: %d\n", value);

		// Call prctl before first fence.i is called inside modify_instruction
		prctl(PR_RISCV_SET_ICACHE_FLUSH_CTX, PR_RISCV_CTX_SW_FENCEI_ON, PR_RISCV_SCOPE_PER_PROCESS);
		modify_instruction();
		// Call prctl after final fence.i is called in process
		prctl(PR_RISCV_SET_ICACHE_FLUSH_CTX, PR_RISCV_CTX_SW_FENCEI_OFF, PR_RISCV_SCOPE_PER_PROCESS);

		value = get_value();
		printf("Value after cmodx: %d\n", value);
		return 0;
	}

cmodx.S::

	.option norvc

	.text
	.global modify_instruction
	modify_instruction:
	lw a0, new_insn
	lui a5,%hi(old_insn)
	sw  a0,%lo(old_insn)(a5)
	fence.i
	ret

	.section modifiable, "awx"
	.global get_value
	get_value:
	li a0, 0
	old_insn:
	addi a0, a0, 0
	ret

	.data
	new_insn:
	addi a0, a0, 1