Add final version of master thesis.

This commit is contained in:
2020-08-20 11:39:46 -04:00
parent 6ba2097e6b
commit 6ff06af4ff
94 changed files with 30356 additions and 1 deletions

content/abstract.tex

@@ -0,0 +1,28 @@
\chapter*{Abstract}
Embedded real-time multi-core systems must adhere to strict timing requirements
in order to guarantee correct execution. Timing requirements are specified to
document system execution paths that are safety critical with respect to the
timing behavior of an application.
Via tracing it is possible to validate the fulfillment of timing requirements
in the native environment of a microcontroller. However, trace tools produce a
trace on hardware or software level, whereas requirements are specified on
system level. A transformation of the former to the latter is required to
close this gap.
Additionally, not all trace techniques are capable of producing results
suitable for the real-time analysis of embedded applications. Most techniques
are not sufficient for one or several reasons: limited trace duration,
inadequate number of recordable objects, and limited timing accuracy.
Therefore, this thesis examines different trace techniques and shows why
hardware tracing is the technique best suited for real-time analysis. Next, the
coherence between hardware, software, and system level entities is examined.
Based on the results a mapping from software level to system level is
introduced and validated.
The thesis concludes that it is possible to record cycle accurate system traces
of arbitrary length via hardware tracing. However, this requires detailed
knowledge about hardware tracing and the operating system underlying an
application.

content/conclusion.tex

@@ -0,0 +1,83 @@
\chapter{Conclusion}
\label{chapter:conclusion}
\subsubsection{Cycle Accurate Tracing}
Hypothesis~\ref{hyp:1} asks whether there is a trace technique capable of
recording cycle accurate traces with a duration of at least one second. There
exist three general measurement techniques. Hybrid and software based trace
tools rely on instrumentation. Thus, they change the runtime behavior of
an application and do not allow cycle accurate trace recording.
Additionally, an on-chip memory to buffer the recorded trace events is
required. Hence, the trace duration is strongly limited by the available
memory. An application with 28 tasks can only be traced for \unit[350]{ms}
using the Gliwa T1 hybrid trace tool \cite{kastner2011integrated} providing
events solely on task level. Runnables were not considered at all.
Hardware tracing is the only trace technique that allows cycle accurate traces
with a duration of at least one second. Actually, durations of over ten
seconds are possible with the correct hardware configuration.
However, there are certain limitations for the hardware platform used in this
thesis. Depending on the clock configuration not all data events are recorded.
This can be avoided by using a CPU core clock frequency smaller than or equal to
\unit[160]{MHz}. Therefore, Hypothesis~\ref{hyp:1} is true.
\subsubsection{ORTI Based Software to System Mapping}
Hardware trace tools create traces on software level. This level is not
sufficient for the real-time analysis of embedded systems. A transformation
from software to system level is therefore required. \gls{orti} was designed
to give third party tools additional information for the trace recording of
applications that use an \gls{osek} compliant \gls{os}. Hypothesis~\ref{hyp:2}
asks if \gls{orti} is sufficient to create a complete mapping from software to
system level.
It has been shown that \gls{orti} can be used to cover only a subset of the
\gls{os} entity types specified in the \gls{btf} standard. Even for those
entities covered by \gls{orti} no complete mapping is feasible. For example,
information about task entities is included in the \gls{orti} file, but it is
not feasible to determine the source entity for a \emph{mtalimitexceeded}
event. Consequently, Hypothesis~\ref{hyp:2} does not hold.
However, it should be noted that \gls{orti} allows the specification of \gls{os}
vendor-specific attributes. This means that if a mapping is possible at all, as
claimed by Hypothesis~\ref{hyp:3}, it would be possible to include the
required information in the \gls{orti} file.
Nevertheless, to the best of my knowledge this thesis is the first work to show
that \gls{btf} \emph{trigger} actions and all process actions except
\emph{mtalimitexceeded} can be created based on the \gls{orti} sections
specified by \gls{osek}.
\subsubsection{Software to System Mapping}
No complete mapping from software to system entities is feasible by relying
solely on the information in the \gls{orti} file. Additional information is
required to achieve a complete mapping. On the one hand a detailed
understanding of the \gls{os} internals is required, on the other hand meta
information must be provided to the transformation algorithm.
The concept of runnables and signals is not specified by \gls{osek}.
Basically, runnables are functions and signals are variables. It is possible
to create runnable and signal events via function and data tracing. A list of
all entities is required to distinguish regular functions from runnables and
regular variables from signals.
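The classification described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in the thesis; all entity names and the event representation are invented examples.

```python
# Illustrative sketch: distinguishing runnables and signals from regular
# functions and variables via entity lists. All names are invented examples.

RUNNABLES = {"Runnable_10ms", "Runnable_100ms"}  # known runnable functions
SIGNALS = {"speed_signal", "torque_signal"}      # known signal variables

def classify(event):
    """Map a raw function/data trace event to a system level event, if any."""
    kind, name = event
    if kind == "call" and name in RUNNABLES:
        return ("runnable_start", name)
    if kind == "write" and name in SIGNALS:
        return ("signal_write", name)
    return None  # regular function or variable, not a system entity

trace = [("call", "helper"), ("call", "Runnable_10ms"), ("write", "speed_signal")]
system_trace = [e for e in map(classify, trace) if e is not None]
```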
To create \gls{btf} events for the event entity type it is necessary to
understand the respective code of the \gls{os}. By parsing the statically
created C header files the event \glspl{id} can be retrieved and the correct
events can be created.
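Retrieving event \glspl{id} from a statically created header can be sketched as follows. The header content and the macro naming scheme are invented for illustration; a real \gls{os} configuration will differ.

```python
import re

# Sketch of retrieving event IDs by parsing a statically created C header.
# The header content and macro naming scheme are invented for illustration.
header = """
#define EVENT_TIMEOUT  0x01U
#define EVENT_WAKEUP   0x02U
"""

def parse_event_ids(source):
    """Return a mapping from event name to numeric event ID."""
    pattern = re.compile(r"#define\s+(EVENT_\w+)\s+0x([0-9A-Fa-f]+)U?")
    return {name: int(value, 16) for name, value in pattern.findall(source)}

event_ids = parse_event_ids(header)
```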
Semaphore events are the most complex entity types to reconstruct via
hardware tracing. \gls{btf} supports all possible types of semaphore like
synchronization mechanisms. Hence, a variety of different actions are
specified. A possible mapping for \gls{osek} resource entities is nevertheless
provided in this thesis.
To the best of my knowledge this is the first work to show that all \gls{btf}
signal, runnable, event, and semaphore actions can be recreated from an
\gls{osek} compliant \gls{os}. Therefore, Hypothesis~\ref{hyp:3} is true.

content/erkl.tex

@@ -0,0 +1,16 @@
\section*{Declaration of original authorship}
\addcontentsline{toc}{section}{\protect\numberline{\thesection}Declaration of original authorship}
\stepcounter{section}
\begin{itemize}
\item[] I am aware that this copy of the master's thesis, as an examination deliverable, becomes the property of the Free State of Bavaria.
\item[] I affirm that I wrote the present thesis independently and used no aids other than those stated.
\item[] Wherever individual passages have been taken, literally or in substance, from the works and internet sources listed in the bibliography, they are in every case marked by an indication of the source.
\item[] The affirmation of independent work also covers the drawings, map sketches, and pictorial representations contained in the thesis.
\item[] I affirm that my master's thesis has not yet been published anywhere else. I am aware that publication must not take place before the assessment has been completed.
\item[] I am aware that a violation of this leads to exclusion from the examination or renders the examination invalid.
\end{itemize}
\vspace{2cm}
Regensburg, 28 October 2015

content/fundamentals.tex

@@ -0,0 +1,10 @@
\chapter{Fundamentals}
\label{chapter:fundamentals}
This thesis discusses the transformation of hardware events to system events
for \gls{osek} compliant real-time \glspl{os}. Hence, the parts of \gls{osek}
that are relevant for this thesis are described in the following.
Additionally, a well-defined format is required to represent the resulting
traces consisting of entities on system level. The \gls{btf} format, which is
discussed in \autoref{chapter:btf}, is used within the context of this work.

content/future.tex

@@ -0,0 +1,85 @@
\chapter{Future Work}
\label{chapter:future_work}
\subsubsection{Improve Trace Interface Standard}
It has been shown that a complete software to system mapping is possible for an
\gls{osekos} and should accordingly also be possible for an \gls{autosaros}.
However, detailed knowledge of the \gls{os} is required to understand and
implement this mapping. \gls{osek} tries to minimize this effort via the
\gls{orti} trace interface. Unfortunately, this interface is only regulated
for a subset of all \gls{os} entity types.
Some entities like spinlocks, semaphores, and inter-process communication
techniques like \gls{autosar} sender-receiver-communication are not covered at
all. In theory \gls{osek} allows additional attributes to be added to the
\gls{orti} file, but this option is currently not comprehensively used by the
\gls{os} vendors. To solve this problem further efforts to reach a common
trace interface standard for all \gls{autosar} system entities should be made.
\subsubsection{Evaluate Different Hardware Platforms}
In this thesis the feasibility of recording cycle accurate hardware traces was
validated for the Infineon Aurix TriCore processor family using the Infineon
Multi-Core Debug System. As described in \autoref{subsection:hardware_tracing}
there exist different trace standards for other processor families.
In order to achieve a better understanding of the trace capabilities of various
hardware platforms, other processor families should be tested in the
future. It has been shown that cycle accurate recording of data events on the
Infineon TC298TF processor is only feasible for certain clock settings. It
would be interesting to know if similar constraints also exist for other
platforms.
\subsubsection{Evaluate Different Operating Systems}
\glsdesc{ee} is used as a representative for an \gls{osek} compliant \gls{os}
in this thesis. It is a sufficient choice because of the available source code
and the permissive license. For \gls{ee}, it could be shown that a mapping
from software to system entities is feasible.
However, \gls{osek} has been taken over by \gls{autosar}. Since
\gls{autosar} is a superset of \gls{osek}, the reasoning for most system
entities is legitimate for both \gls{os} standards. Nevertheless, \gls{autosar}
introduces new synchronization patterns (of which some have been adopted by
\gls{ee}) and it would be interesting to know if a mapping is possible for
those new techniques as well.
Additionally, a complete mapping could only be created because the source code
of \gls{ee} is freely available. It would be interesting to know if the same
approach is feasible for a commercial \gls{os} that does not make its source
code available. This is an important question to answer since the automotive
industry relies predominantly on commercial \glspl{os}.
\subsubsection{Validate Mapping With Real World Applications}
Finally, the feasibility of the software to system mapping has been shown and
validated for several test applications. One part of those applications was
created manually to cover specific test cases, the other part was created
randomly. However, all test applications have in common that they do not
execute real functionality. Instead, dummy instructions are used to simulate
runtime that would emerge on real hardware due to the computation of algorithms
and feedback loops.
It may be possible that the trace capability of the tested hardware is limited
for real applications. If this is the case, the mapping introduced in this
thesis may not be fully applicable in the real world, for example because
the bandwidth for recording \gls{os} data events is limited. To investigate
this question industrial case studies should be conducted based on the
approaches discussed in this thesis.
\subsubsection{Trace a Multi-ECU Setup}
In many environments microcontrollers operate in big networks. For example, in
modern cars up to 70 ECUs are installed and connected via at least five
different field bus systems \cite{maxmaster}. In such systems correct system
performance is not only dependent on the behavior of a single controller, but
also on the interaction of the system as a whole. The ability to trace
multiple ECUs in parallel would provide enormous benefits in the analysis and
validation of multi-ECU systems.
In order to get meaningful results from the analysis of a multi-ECU trace it is
mandatory that the timestamps from all ECUs are synchronous. Otherwise, the
delay between different processors would result in wrong evaluation metrics and
no valid conclusions could be drawn. Therefore, the feasibility of a multi-ECU
trace environment is an interesting and important topic for future work.


@@ -0,0 +1,515 @@
\chapter{Hardware Trace Measurement}
\label{section:trace_measurement}
Computer systems can be analyzed with measurement tools that detect events,
i.e.\ changes in the state of a system \cite[p. 28]{ferrari1978computer}. The
same event can be interpreted on different levels as shown in
\autoref{fig:trace_event_levels}. A hardware trace tool can detect a voltage
change in memory, e.g.\ triggered by the processor which is a hardware event.
Accordingly, the variable that maps to the changed memory register changes too
which is a software event. If this variable is related to the state of a task,
a change of the variable also means a change of the task state which is then
called a system event.
In many cases, the event of interest cannot be measured directly. One or more
transformation steps are required to retrieve the required result. If a
transformation process is executed the measurement is said to be indirect
\cite[p. 28]{ferrari1978computer}. Considering the previous example a task
termination event cannot be measured directly. However, a variable that
contains the current task state can be measured. If the task corresponding
to the variable and the mapping from value to task state is known, a change of
the variable can be transformed into a higher-level event: the termination of a
task. After the transformation process the measurement results can be
displayed to the user as shown in \autoref{fig:concept_measurement}.
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{./media/trace/concept_measurement.pdf}
\caption[Measurement process]{The conceptual parts of a measurement process
according to Ferrari \cite{ferrari1978computer}. A sensor measures data. One
or more transformation steps are required if the data is not yet in the
desired format. Finally the result can be presented to the user.}
\label{fig:concept_measurement}
\end{figure}
During the transformation step the collected data may be manipulated, which is
called prereduction. Prereduction may for example be used when the actual
event is not required, but rather the number of events of a certain type that
occurred. For this case the transformer would increment a counter whenever a
certain event type is collected. If no prereduction is executed, the
measurement process is called tracing. Tracing is the process of recording a
sequence of events in chronological order of occurrence \cite[p.
30]{ferrari1978computer}. The result of this process is called a trace.
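The difference between tracing and prereduction can be sketched for a short sequence of invented task events: tracing preserves the chronological sequence, prereduction collapses it into counts.

```python
from collections import Counter

# Sketch of the difference between tracing and prereduction for a short
# sequence of (invented) task events.
events = ["task_activate", "task_start", "task_activate", "task_terminate"]

# Tracing: record every event in chronological order of occurrence.
trace = list(events)

# Prereduction: keep only the number of occurrences per event type.
counts = Counter(events)
```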
\section{Trace Tools}
Ferrari \cite[p. 31ff]{ferrari1978computer} distinguishes three trace
measurement tools: software, hybrid, and hardware tools. All tools are meant
to examine the behavior of a system. However, there are differences in
interference, resolution, and cost as summarized in
\autoref{tab:trace_tool_overview}.
If a measurement tool uses resources of the target system it causes
interference by using computational power and memory that could otherwise be
utilized by the application. A tool that causes interference is said to be
intrusive and may cause degradation, a reduction in performance of the target
system \cite[p. 29]{ferrari1978computer}. Consequently, intrusive trace tools
change the real-time behavior of an application.
An event can be represented on different levels. A voltage level change in
memory can map to a variable which can map to the state of a task as
visualized in \autoref{fig:trace_event_levels}. Those levels are called
hardware level, software level, and system level. To clarify the level of a
trace, it can be mentioned explicitly. For instance, a trace consisting of
hardware level events is a hardware level trace \cite[p. 29f]{felixproject2}.
Tools that can detect hardware events occurring at a microscopic level are
said to have a higher resolution than tools that can detect software events
only.
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{./media/trace/trace_event_levels.pdf}
\caption[Measurement levels]{A measurement event can be interpreted on
different levels. A voltage change in memory can be detected by a hardware
trace tool capable of supervising the memory bus that triggers the voltage
change. The memory section can relate to a variable, that changes in
consequence of the voltage change, which is a software event. If the variable
is related to the state of a task, a change of the variable also means a
change of the task state which is then called a system event.}
\label{fig:trace_event_levels}
\end{figure}
Different trace techniques can detect and record events with different
frequencies. The maximum frequency is usually not limited by the speed with
which events can be detected, but by the available bandwidth to process and
record the detected events.
The cost of different trace tools depends on several factors, the price for
hardware and software licenses, the price for installing and maintaining the
tool, educational costs, like training for the users of a tool, and the costs
of operating the tool.
\textbf{Software tools} add instructions to a hardware-software system in order
to detect and record events of interest. Added instructions are called
instrumentation. The simplest kind of instrumentation is a classical write to
the standard output interface, e.g.\ a \lstinline{printf} statement in the C
programming language. Instructions may be added to the application code
directly, via the compiler or post compilation via dynamic binary
instrumentation \cite{trumper2012maintenance, felixarc2015}. If no
standard output interface is available, events are recorded into memory on
target. From there they can be read out via debugger or serial interface.
Instrumentation always interferes with the application. There are two
components of interference, a space and a time component \cite[p.
44]{ferrari1978computer}. Execution of instrumentation code takes time and
storing detected events uses memory space. Software tools have a low
resolution because they cannot detect events on a hardware level. Event
detection frequency is limited by the available computational resources. On
the upside they are usually cheap and easy to implement and use.
\textbf{Hardware tools} do not rely on instrumentation, which means that they
are non-intrusive and do not interfere with the application
\cite{felixarc2014}. Hardware tracing works via a dedicated trace device chip
that is located on the silicon of the CPU\@. Trace devices provide a very high
resolution since they are capable of detecting events at hardware level
\cite{mink1989performance}. Additionally, the event detection frequency can be
as high as the actual system frequency; thus it is possible to record a
complete hardware-software system in real-time. Hardware tools are more
expensive compared to software solutions. Installation and maintenance are
more complex and require properly qualified users.
\textbf{Hybrid tools} rely on instrumentation and a dedicated hardware
interface to record events. The boundary between software, hybrid, and
hardware tools can be fuzzy in certain cases. Software tools need some kind of
hardware interface to send recorded traces off-chip. In this sense, all
software tools are hybrid tools. However, industry hybrid solutions often
require proprietary target interfaces which justifies why these tools fit into
a separate category \cite{richterganzheitliche}. Compared to pure software
tools, hybrid tools interfere with the system to a lesser extent
\cite{nacht1989hardware}. A dedicated hardware interface makes it possible to
send events off-chip in real-time. Consequently, more memory becomes available
on target.
As shown in \autoref{tab:trace_tool_overview} hardware trace tools have many
advantages over hybrid and software based solutions. Hardware tracing does not
interfere with the system, which is especially important for real-time systems.
Hardware trace tools are capable of detecting events with a higher resolution
and frequency. Additionally, the trace duration of software and hybrid traces
is limited by the available memory on target and by the trace interface
bandwidth. When the same quantity can be measured by a hardware and a software
tool, the values obtained by the hardware tool are usually to be considered
more accurate because of the lower interference \cite[p.
45]{ferrari1978computer}.
\begin{table}[]
\centering
\begin{tabular}{r|c c c}
& Software & Hybrid & Hardware \\
\hline
Interference & high & low & no \\
Resolution & low & low & high \\
Cost & low & low & high \\
Frequency & low & low & high \\
\end{tabular}
\caption[Trace techniques]{Properties of different trace
measurement tools \cite[p. 6]{felixproject1}. Hardware tools are superior
to software and hybrid tools but come with higher expenses.}
\label{tab:trace_tool_overview}
\end{table}
\section{Hardware Tracing}
\label{subsection:hardware_tracing}
Hardware tracing is capable of recording events on hardware level. A dedicated
on-chip trace device and trace interface is required to record hardware events
and send them off-chip \cite{mink1990multiprocessor}. Target access hardware
is connected to the trace interface to read out the trace measurement results.
From there the events are forwarded to a host computer for further processing.
Software that runs on the host computer in order to analyze the recorded trace
data is provided by the target access hardware vendor \cite{winidea}. The term
host software is used to refer to such applications.
The on-chip trace device is designed to record hardware events executed by the
microcontroller. It occupies a separate section on the silicon. Usually a
controller is delivered in two versions, one with and one without trace device.
In production the ability to execute trace measurement is not required
\cite{felixarc2014}. Therefore, the trace device would only increase chip
costs without providing any benefits.
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{./media/trace/tc27_emulation_device.png}
\caption[Infineon TC27x trace device]{A microcontroller with hardware trace
support consists of two sections. A regular product chip part and the trace
device part. The trace device part can be omitted in the production version
of a chip to save costs \cite{tc27block}.}
\label{fig:tc27_emulation_device}
\end{figure}
\autoref{fig:tc27_emulation_device} shows the trace device of the Infineon
TC27x microcontroller family \cite{tc27x}. The upper part belongs to the
product chip while the lower part displays the trace device. The trace device
can gather data from the product part via two interfaces. \glspl{pob}
(\glsdesc{pob}) record processor events while \glspl{bob} record bus events.
All events are collected, enhanced with a timestamp and buffered in the on-chip
trace memory. From there they are sent off-chip via the dedicated trace
interface.
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{./media/trace/timestamp_generation_event.pdf}
\caption[Timestamp per event]{Each trace event is assigned a timestamp
relative to the previous event. By summing up the relative timestamps
absolute values can be generated.}
\label{fig:timestamp_generation_event}
\end{figure}
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{./media/trace/timestamp_generation_dedicated.pdf}
\caption[Dedicated timestamp generation]{Via dedicated timestamp events, the
timestamps of the other events can be interpolated. In this example two
events are recorded between the previous and the next timestamp event, which
is why both events are assigned the same interpolated timestamp. The value
is calculated via \autoref{eq:timestamp_interpolation} as $t_i = 5 +
\frac{(15-5)}{2}=10$.}
\label{fig:timestamp_generation_dedicated}
\end{figure}
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{./media/trace/timestamp_generation_io.pdf}
\caption[Timestamp via \gls{io}]{Dedicated \gls{io} pins can be used to output
a timestamp value whenever a measurement event is sent off-chip.}
\label{fig:timestamp_generation_io}
\end{figure}
There exist different techniques to add timestamp information to a trace event.
The obvious way is shown in \autoref{fig:timestamp_generation_event}. A
timestamp is added to each trace event that is sent off-chip. To save
bandwidth, timestamps are provided relative to the previous event. An
absolute value is computed by summing up all previous timestamps.
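Recovering absolute timestamps from relative ones is a cumulative sum, as this small sketch shows (the delta values are invented):

```python
from itertools import accumulate

# Sketch of recovering absolute timestamps from relative ones. The values
# are the kind of per-event deltas a trace device might emit (invented here).
relative = [5, 3, 7, 2]                # cycles since the previous event
absolute = list(accumulate(relative))  # running sum gives absolute times
```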
Another way is to send dedicated timestamp messages as shown in
\autoref{fig:timestamp_generation_dedicated}. The timestamps for the actual
trace events are then interpolated, e.g., via the equation
\begin{equation}
\label{eq:timestamp_interpolation}
t_{i} = t_p + \frac{(t_n - t_p)}{2},
\end{equation}
where $t_p$ is the previous timestamp (the latest timestamp before the event),
$t_n$ the next timestamp (the earliest timestamp after the event), and $t_i$ the
timestamp interpolated based on the dedicated timestamp events.
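The interpolation of \autoref{eq:timestamp_interpolation} can be written out directly, reproducing the worked example of the figure:

```python
def interpolate(t_prev, t_next):
    """Timestamp for events recorded between two dedicated timestamp events,
    following t_i = t_p + (t_n - t_p) / 2."""
    return t_prev + (t_next - t_prev) / 2

# The example from the figure: events between timestamps 5 and 15.
t_i = interpolate(5, 15)  # 10.0
```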
Finally, timestamps can also be created via dedicated \gls{io} pins as
specified by the Nexus \cite{turley2004nexus} standard. This means that
whenever a trace event is sent off-chip via the trace interface, the current
timestamp is provided via the \gls{io} pins as shown in
\autoref{fig:timestamp_generation_io}.
Cycle accurate timestamps are feasible with all timestamp generation
techniques. However, timestamp accuracy and resolution are only partly
dependent on the generation technique. More important factors are CPU and
trace device clock frequency, as well as the design of CPU and trace device.
For cycle accurate timestamps, the trace device frequency must be greater than
or equal to the CPU frequency. Even if this is the case, cycle accurate time\-stamps cannot
necessarily be guaranteed.
For example, super scalar processors like the Infineon TC277 \cite{tc27x} are
capable of executing more than one instruction per cycle. However, only one
event can be processed per cycle by the trace device as shown in
\autoref{fig:timestamp_cycle}. The processor observation block filters the
instructions according to user specified filter rules and forwards them for
further processing. If two instructions, executed during the same processor
cycle, match the filter and are thus forwarded to the trace device, one of
those instructions is delayed by one cycle (in this example Instruction 2.1).
For a processor running at \unit[100]{MHz} this would set the timestamp off by
\unit[10]{ns} for this particular event.
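The timestamp error caused by a one-cycle delay follows directly from the clock period:

```python
def cycle_time_ns(cpu_freq_mhz):
    """Duration of one CPU cycle in nanoseconds (1000 / f for f in MHz)."""
    return 1000 / cpu_freq_mhz

# A one-cycle delay at 100 MHz sets the timestamp off by one clock period.
offset_ns = cycle_time_ns(100)
```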
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{./media/trace/timestamp_cycle.pdf}
\caption[Timestamp generation accuracy]{Even if the trace device runs at CPU
clock frequency, cycle accurate timestamps cannot be guaranteed.}
\label{fig:timestamp_cycle}
\end{figure}
The design of trace devices differs depending on the processor family and the
processor vendor. However, the general concept and provided functionality are
the same for all devices. Various standards for the implementation of
trace devices are specified and used by chip vendors. Three common standards
are Nexus used by PowerPC processors \cite{turley2004nexus}, \gls{etm}
(\glsdesc{etm}) used by ARM processors \cite[p. 476]{yiu2013definitive}, and
the \glsdesc{imds} \cite{stollon2011infineon} discussed here and shown in
\autoref{fig:tc27_emulation_device}.
According to \autoref{fig:concept_measurement}, a measurement process starts
with the detection of an event by a sensor. In case of the trace process the
sensors are the \glspl{pob} and \glspl{bob}. Each \gls{pob} monitors the
instructions executed by one processor core. This means the complete program
flow executed by a processor core can be recorded. \glspl{bob} are connected
to the data busses of the microcontroller and can detect memory access events.
A memory access event may be for example, writing to a variable or reading
from a special function register. A typical data trace event contains in
addition to the timestamp, details like address, data value, transfer size, and
whether a read or write access occurred \cite{hopkins2006debug}.
Filters can be specified by the user to reduce the amount of recorded trace
events. They can be set for an address or for an address range. Different
events can be executed if an address filter matches: the corresponding event
can be recorded, discarded or another event can be triggered. For example, it
is possible to start or stop the trace process if a specific function is
accessed or a variable is written. Filter configuration is done via the host
software.
Corresponding to the two main hardware event types, instruction, and data
access events, two hardware trace techniques can be distinguished, program flow
trace and data trace \cite{felixarc2014}. The two trace techniques can be
executed in parallel or individually as configured by the user.
A \textbf{program flow trace} (also called function trace) shows the complete
execution path of an application for the duration of the trace recording. This
means it is possible to detect when a certain function is called or which
branch of an if statement is executed. The number of instructions and the
resulting data stream bandwidth produced by a modern CPU is too big to be
transmitted via the trace interface. To solve this problem trace devices use
trace compression. The most commonly used program flow trace compression
technique works by detecting and recording only such instructions that cause a
change in program flow such as conditional jumps and traps
\cite{hopkins2006debug}. Using the application binary the host software is
able to reconstruct the complete program flow.
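The reconstruction principle can be sketched with an invented "binary": only taken branches appear in the trace, and the host walks sequentially between them.

```python
# Sketch of program flow reconstruction: only flow-changing instructions
# (taken branches) are recorded; the host replays the binary linearly between
# them. Addresses and the branch layout are invented for illustration.

taken_branches = {2: 5}  # recorded trace: the branch at address 2 jumped to 5

def reconstruct(start, end):
    """Walk the binary from start to end, following recorded branch targets."""
    path, pc = [], start
    while pc <= end:
        path.append(pc)
        pc = taken_branches.get(pc, pc + 1)  # sequential unless a branch fired
    return path

flow = reconstruct(0, 6)  # visits 0, 1, 2, then jumps to 5, 6
```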
A \textbf{data trace} is a sequence of data access events. Data tracing makes it
possible to supervise and debug the state of variables in memory. Data tracing
of all active units is becoming increasingly important because not all data
interactions involve a processor \cite{mayer2003debug}. Thus, trace devices
must also be able to detect memory accesses via \gls{dma} (\glsdesc{dma}) and
accesses to memory of special on-chip modules like FlexRay or Ethernet. The
units that are supported by a microcontroller depend on the trace device,
but all trace devices support tracing the main memory of a controller.
Compression is also applied to data traces. However, those techniques are
usually not sufficient to record a complete data trace of significant length
since the amount of generated data is too big. The best way to solve this
problem is to apply filters to avoid detecting and recording data events in
memory sections that are not of interest \cite{hopkins2006debug}.
A recorded hardware trace event is buffered into an on-chip trace memory. From
there the events can be read via the trace interface. On-chip trace memories
can be operated in different modes \cite{felixarc2014}. In continuous mode
the trace data is streamed off-chip in real-time. This technique is limited by
the bandwidth of the trace interface. If it is high enough, the trace duration
depends only on the available memory on the host computer and traces of
arbitrary length can be recorded. If the bandwidth is too small to process the
recorded trace stream \emph{buffer mode} must be used. This means the recorded
trace is written into trace memory and read out by the target access hardware
post tracing. Buffer mode can be used in pre- and post-trigger mode. In
pre-trigger mode the trace buffer is filled like a circular buffer. The oldest
events are discarded for new events. The trace process can be stopped at an
arbitrary point in time and the latest trace events become available. In
post-trigger mode the trace process is stopped as soon as the buffer has been
filled for the first time.
A trace device operated in buffer mode is limited by the available trace
memory. The trace memory size of an Infineon TC275 microcontroller
(\autoref{fig:workbench} a) is \unit[2]{MB}, which allows for approximately
\unit[33]{ms} of unfiltered function and data trace of a single processor core
running at \unit[200]{MHz} \cite{felixarc2014}. Depending on the measurement
use case this may or may not be sufficient. If the trace duration is to be
increased, tracing in continuous mode is mandatory. Continuous tracing
requires a high bandwidth interface such as \gls{agbt} (\glsdesc{agbt}).
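A back-of-the-envelope check makes the limit tangible. The event generation rate of roughly \unit[60]{MB/s} used below is not from the cited source; it is an assumption inferred from the \unit[2]{MB} per \unit[33]{ms} figures above.

```python
# Illustrative arithmetic: how long does a trace buffer last at a given
# event generation rate? (~60 MB/s is an assumption derived from 2 MB / 33 ms.)
def buffer_trace_duration_ms(buffer_bytes, event_rate_bytes_per_s):
    return buffer_bytes / event_rate_bytes_per_s * 1000.0

duration = buffer_trace_duration_ms(2 * 1024 * 1024, 60e6)  # roughly 35 ms
```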
\section{Hardware Trace Toolchain}
Multiple steps are required from recording a hardware trace on target to
presenting it to the user on a personal computer as shown in
\autoref{fig:toolchain}. Many different solutions exist for each of those
steps. Nevertheless, the basic functionalities provided by all solutions
are comparable.
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{./media/trace/toolchain.pdf}
\caption[Trace toolchain]{Recording a hardware trace and making it
available to the user requires multiple steps. Hardware events must be
measured on target via a trace device. Using a trace interface the recorded
data can be read out by the target access hardware and transmitted to a host
computer. Target access hardware vendors provide special software to analyze
and visualize the recorded trace.}
\label{fig:toolchain}
\end{figure}
The basic prerequisite for executing a hardware trace is the availability of an
on-chip trace device. All major chip vendors provide trace devices for their
microcontrollers that support program flow and data trace.
\autoref{tab:trace_devices} gives an overview of the state-of-the-art trace
solutions.
\begin{table}[]
\centering
\begin{tabular}{r|c c c}
Standard & Architecture & Function Trace & Data Trace\\
\hline
Nexus &
PowerPC &
\begin{tabular}[x]{@{}c@{}} Branch Trace \\ Messaging \end{tabular} &
\begin{tabular}[x]{@{}c@{}} Data Trace \\ Messaging \end{tabular} \\
\hline
\gls{etm} &
ARM &
\begin{tabular}[x]{@{}c@{}}Program Trace \\ Macrocell \end{tabular} &
\begin{tabular}[x]{@{}c@{}}Embedded Trace \\ Macrocell \end{tabular} \\
\hline
\gls{imds} &
TriCore &
\begin{tabular}[x]{@{}c@{}}Processor \\ Observation Block \end{tabular} &
\begin{tabular}[x]{@{}c@{}}Bus \\ Observation Block \end{tabular} \\
\end{tabular}
\caption[Trace devices for different architectures]{Trace devices exist for
different CPU architectures. All solutions provide methods for recording
program flow and data traces.}
\label{tab:trace_devices}
\end{table}
Events that have been recorded by the trace device are sent off-chip via a
dedicated trace interface. If the bandwidth provided by an interface is lower
than the rate at which trace events are created, continuous tracing is not
possible. However, this use case is often required. There are two ways to
solve this problem: the amount of created trace data can be reduced using
filters, or the available bandwidth can be increased. If an entire application
must be analyzed as a whole, the first option is ruled out.
\begin{table}[]
\centering
\begin{tabular}{r|l c}
Interface & Pros/Cons & DAQ rate {\small [MB/s]}\\
\hline
JTAG &
\begin{tabular}[x]{@{}l@{}}
$+$ Reuse of existing interface \\
$+$ Small chip area \\
$-$ Low bandwidth \\
\vspace{1mm}
\end{tabular} &
1.2 \\
DAP2/SWD &
\begin{tabular}[x]{@{}l@{}}
$+$ High bandwidth with few pins \\
$+$ Small silicon area \\
$-$ Proprietary \\
\vspace{1mm}
\end{tabular} &
10 \\
\gls{agbt} &
\begin{tabular}[x]{@{}l@{}}
$+$ Very high bandwidth with few pins \\
$-$ Large silicon area \\
$-$ High cost \\
\vspace{1mm}
\end{tabular} &
30 \\
CAN &
\begin{tabular}[x]{@{}l@{}}
$+$ Robust and well known standard \\
$+$ Low cost \\
$-$ Very low bandwidth \\
\end{tabular} &
0.05 \\
\end{tabular}
\caption[Trace interfaces]{Commonly used trace interfaces and their \gls{daq}
(\glsdesc{daq}) rates. \gls{agbt} (\glsdesc{agbt}) is the only interface
capable of recording continuous hardware traces of a complete system.}
\label{tab:interfaces}
\end{table}
Mayer et al.\ \cite{interfaces} give an overview of trace interfaces used in
the automotive industry as shown in \autoref{tab:interfaces}. \gls{jtag}
(\glsdesc{jtag}) is a common debug standard \cite{ieee5001}, suitable for
regular debugging. It can be used to read out a buffered trace after tracing,
but it is not sufficient for continuous tracing due to its low bandwidth of
\unit[1.2]{MB/s}. Because of that, DAP and DAP2 were developed by Infineon and
SWD by ARM\@. Both protocols are based on \gls{jtag} but use higher
frequencies and improved communication protocols to provide more bandwidth.
\gls{agbt} is currently the fastest trace interface. It was specified by
XILINX and adopted by the Nexus standard. \gls{agbt} is the only interface
which is theoretically capable of recording a continuous trace of a complete
application running on a processor with a frequency of \unit[200]{MHz}. CAN is
used by some hybrid trace tools but is only mentioned for completeness since
its bandwidth is too low to be considered for hardware tracing.
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{./media/trace/workbench.png}
\caption[Trace workbench]{A complete trace workbench. An Infineon TriCore
evaluation board (a) can be traced by the iSYSTEM iC6000 (b) or the Lauterbach
PowerTrace-II (e) via the high-speed \gls{agbt} interface. Host software is
used to control the hardware and to analyze the recorded trace, for example
WinIDEA (c) by iSYSTEM and TRACE32 (d) by Lauterbach \cite{maxmaster}.}
\label{fig:workbench}
\end{figure}
Target access hardware is connected to the trace interface to read out
recorded trace events. From the target access hardware the data is transmitted
to a host computer for further analysis via USB 3.0 or Ethernet. Examples for
target access hardware are the iC6000 by iSYSTEM \cite{ic6000}
(\autoref{fig:workbench} b) and the PowerTrace-II by Lauterbach
\cite{powertrace2} (\autoref{fig:workbench} e). Both devices support
different architectures and trace interfaces by using architecture specific
debug cables. Besides reading hardware traces, those devices also support all
functionalities provided by a regular debugger such as step-wise debugging,
reading of memory content, and manipulation of CPU configuration registers.
Dedicated software on the host computer is used to configure and control the
target access hardware and the trace device itself. After recording, this
software transforms the recorded hardware trace into a software trace (see
\autoref{fig:trace_event_levels}). For this process the host software must
have access to the \gls{elf} file of an application. This is required to map
the addresses of hardware trace events to the corresponding software entities.
Based on the software trace, different analysis techniques such as metric
evaluation, performance analysis, and code coverage are supported. Gantt
charts are provided to examine the trace visually. Via export functions a
software level program flow and data trace can be made available for external
tools. \autoref{fig:workbench} shows the toolchain described in this section.
\chapter{Introduction}
\label{chapter:Introduction}
Embedded applications are increasingly required to provide real-time
performance \cite{hopkins2006debug}. This means that the correct behavior of a
system is not only dependent on the logical results of a computation, but also
on the physical instant in which these are produced \cite{kopetz2011real}. For
hard real-time applications, violation of a deadline results in damage to
the system or its environment \cite{tokuda1990real}.
Due to the pervasive nature of embedded systems and their use for critical
applications, e.g., medical devices or advanced driver assistance systems,
measures to ensure the correctness of time dependent functionality must be
taken \cite{konrad2005real}. Therefore, debugging and validation are a
fundamental part of the development process of such applications
\cite{dixon2013advantages}.
Different techniques to debug embedded systems exist \cite{schneider2004ten}.
The simplest one is a classical \lstinline{printf} statement in C (or the
equivalent in another language). More sophisticated debug technologies require
on-chip debug logic in the embedded processor. On-chip debug generally
supports two different types of functionality: run-control debug and real-time
trace \cite{dixon2013advantages}.
The former allows engineers to stop and examine the state of a system at
points of interest, so-called breakpoints. This approach is intrusive, in
other words it changes the runtime behavior of an application. This is not
acceptable for
time critical applications, e.g., engine control units that require continuous
execution of the processor in order to control feedback loops and to maintain
mechanical stability \cite{dixon2013advantages}.
Real-time trace recording, or tracing, however allows a system to be analyzed
and debugged without stopping its execution. It works by recording processor
events such as function calls and data accesses. The captured events can be
used to reconstruct and analyze the runtime behavior of an application.
Since timing is an integral part in the development of safe and secure
real-time applications, timing dependencies should be included in the software
interface specifications \cite{lutz1993analyzing}. One way to specify these
dependencies are timing requirements, e.g., the maximum response time for a
certain task \cite{deubzer2011robust}. Via tracing system engineers are
capable of validating those requirements on target.
\glsdesc{ta} (\gls{ta}) provides the \gls{ta} Tool Suite, a collection of tools
for the system design, simulation, automated optimization, and target
verification of embedded real-time multi-core and many-core systems
\cite{tatoolsuite}. These features work on the basis of system models.
Consequently, requirements are defined for system entities such as tasks,
runnables, signals, and semaphores.
In contrast, trace recording produces events on software level. This means
a trace contains information about function entries and exits, and data read
and write accesses. As a consequence, the specified system requirements cannot
be evaluated.
However, by mapping software events to the corresponding system events it is
possible to transform a software to a system level trace. \glsdesc{btf}
(\gls{btf}) is a trace format on system level and is used in this thesis
because of its native support for multi-core environments. To the best of my
knowledge, the possibility of a software to system mapping has only been shown
for a small subset of all entities specified by \gls{btf}.
In this thesis the feasibility of mapping all event actions contained in the
\gls{btf} standard is discussed, evaluated and validated. Furthermore,
different real-time trace techniques are discussed with respect to their
suitability for the timing analysis of embedded multi-core real-time
applications.
\section{Motivation}
\label{section:motivation}
Transformation of software events to system events is required for the timing
analysis of embedded real-time systems as discussed in the previous section.
Moreover, system traces can also be used for several other use cases, which
are covered in the following.
\subsubsection{Simulation Validation}
A simulation can be executed for a timing model by the TA Simulator. The
resulting simulated trace can be evaluated to validate the compliance of an
application with the specified requirements.
A simulated and a hardware based system trace will never be equal by
definition because a model is an abstraction of reality. Nevertheless,
simulation supports engineers in validating system behavior in early design
stages. It can abstract complex problems and analyze non-deterministic system
behavior \cite{sifakis2003building}.
However, a simulation is still software, which is vulnerable to bugs and can
potentially produce wrong results. A deviation from reality due to the
abstraction cannot be classified as a wrong result; an implementation error,
on the other hand, can be.
Via tracing it is possible to validate the correctness of simulated traces.
This is especially useful if a new simulation feature is implemented. In this
case a system trace recorded from hardware can provide valuable insights into
the actual behavior.
\subsubsection{OS Overhead Measurement}
Another aspect that is relevant for the development of embedded applications is
the overhead caused by the operating system (\gls{os})
\cite{zeng2011mechanisms}. Overheads are execution periods where the processor
is not used by the actual application but by the \gls{os}, for example for
context switches and inter-core communication mechanisms.
Especially for applications with a high processor utilization the additional
overhead caused by the \gls{os} plays a critical role. Fulfillment of timing
requirements may be feasible or not depending on the overhead
\cite{maxmaster}. In order to take this into consideration, a good
understanding of the execution times required by \gls{os} routines is
necessary. System traces recorded on hardware make it easy to determine the
exact execution times of these overheads.
\subsubsection{Model Reconstruction}
The initial creation of a timing model for an existing application is a tedious
process if it must be done manually. Model reconstruction can simplify this
task by creating a timing model automatically. It works by analyzing a
system trace recorded from hardware. By detecting common timing patterns in
the trace a model of the application can be created
\cite{sailer2014reconstruction}.
\section{Related Work}
The two main topics discussed in this thesis are tracing and hardware to system
mapping. While the former has been an important topic in the literature over
the last three decades, the necessity for the latter has only become important
in recent years.
\subsubsection{Tracing}
Ferrari \cite{ferrari1978computer} gives a comprehensive overview of major
computer performance evaluation techniques and their application to various
types of performance problems. In his book \emph{Computer Systems Performance
Evaluation} he distinguishes between three trace measurement techniques:
software, hybrid, and hardware based trace measurement. It is important to
understand that these techniques do not directly relate to the trace
abstraction levels discussed in the previous sections. The concepts described
in his book, which was released in 1978, are still relevant today, even though
the implementations are outdated.
Mink et al.\ \cite{mink1989performance} discuss hardware based performance
measurement in more detail. They argue that hardware tracing is the only
sufficient trace technique for recording resource utilization information
because of the high signal speeds involved and the fact that not all signals
are visible to software measurement techniques. Resource utilization is
concerned with detailed information about the operation of the hardware such as
cache hit ratios and access delays. Moreover, they mention that software based
tracing is intrusive and thus changes the runtime characteristics of an
application.
Kraft et al.\ \cite{kraft2010trace} discuss trace measurement in the context of
five industrial projects. They argue that hardware trace solutions require
large, expensive equipment mainly intended for lab use. Additionally, they
claim that software based trace solutions can also remain active in
applications post-release. Based on these arguments they use a software based
trace measurement approach in their paper. They introduce a software
instrumentation approach with a very low overhead according to their
measurement results.
\subsubsection{Hardware to System Mapping}
Lauterbach \cite{lauterbach2015third} provides a possibility to export task and
runnable system events for traces recorded via hardware tracing. However,
their approach is limited to a subset of the existing task and runnable events.
For example, runnable preempt and resume, and task wait events are not covered
by the Lauterbach export even though this information is relevant for the
real-time analysis. Lauterbach uses the information from the \glsdesc{orti}
(\gls{orti}) files and relies solely on function trace events for the export.
Kraft et al.\ \cite{kraft2010trace} also discuss how task events on system
level can be recorded. They argue that it is difficult to detect which entity
blocks a task because the scheduling status of the \gls{os} only provides
information about the entity type blocking the task, not the entity itself.
They suggest code instrumentation as a pragmatic solution to work around this
problem, admitting that this approach is problematic because the
instrumentation points have to be maintained by the developer.
\section{Interrogation}
\label{section:interrogation}
Timing analysis of embedded systems requires a trace, i.e., a sequence of
events, with sufficient duration and timestamp accuracy. The minimum trace
duration is dependent on the application and requirements that should be
validated. Fundamentally, the longer the trace duration the more information
for the real-time analysis of the application are acquired. However, more data
requires longer processing times. Therefore, a trace duration of at least one
second is demanded in this thesis to provide a tradeoff between processing time
and sufficient length for the real-life use-cases discussed in
\autoref{section:motivation}.
Timestamp accuracy is important for the real-time analysis because if the
resolution is too low no meaningful analysis may be feasible. For example, if
events can only be recorded in the range of milliseconds, the analysis of
requirements in the microseconds range is not feasible.
Kraft et al. \cite{kraft2010trace} also state that a timestamp accuracy in the
milliseconds range is too coarse-grained for embedded systems timing analysis.
Especially for validation of simulation tools and model reconstruction cycle
accurate timestamps would provide enormous benefits. From these requirements
the first hypothesis that should be evaluated in this thesis can be derived.
\begin{hyp}
\label{hyp:1}
There exists a trace technique that allows recording of cycle accurate traces
for embedded multi-core real-time systems with a duration of at least one
second.
\end{hyp}
Trace techniques output a trace on software level, i.e., a sequence of software
events. These events provide information about the code segments executed by
an application and the memory regions accessed. This information allows deep
insights into the runtime behavior of an embedded system, but is not sufficient
for its real-time analysis.
Traces on system level or in other words, sequences of system events are
required for the real-time analysis of embedded multi-core applications. In
the context of this thesis system events are defined as all events that are
contained in the \gls{btf} specification and not explicitly excluded in
\autoref{subsection:btf_entity_types}. With an understanding of the underlying
\gls{os} mechanisms it may be possible to map software to system events.
\gls{osek} and \gls{autosar} are common standards for the development of
applications in the automotive industry. These standards are discussed in more
detail later. \gls{osek} compliant operating systems feature a so-called
\gls{orti} file.
The aim of \gls{orti} is to make \gls{os} internal data visible to external
tools \cite{osekortia}. This means it is possible via \gls{orti} to relate
software level entities to their respective interpretation on system level. It
must be examined if a mapping for all \gls{btf} entities is feasible.
\begin{hyp}
\label{hyp:2}
A complete mapping from software to system entities is feasible based on the
information included in the \gls{orti} file for an \gls{osek} compliant
\glsdesc{os}.
\end{hyp}
If Hypothesis~\ref{hyp:2} does not hold other ways to achieve a complete
software to system mapping must be found. An \gls{os} must keep track of the
states of all relevant system objects internally. Otherwise, it would not be
possible to execute appropriate actions if required. For example, if one task
activates another one the \gls{os} must determine whether the corresponding
task is allowed to be activated or if the maximum number of activations has
already been exceeded.
By analyzing the internal data structures of an \gls{os} it may be possible to
construct a mapping from software to system entities. Considering the previous
example, there might be an \gls{os} data structure that keeps track of the
remaining activations for each task entity. If the field for a task is
incremented, an entity of the corresponding task terminates. If it is
decremented a new task instance is activated.
\begin{hyp}
\label{hyp:3}
A complete mapping from software to system entities is feasible for an
\gls{osek} compliant \glsdesc{os}.
\end{hyp}
\section{Outline}
In order to transform a trace recorded from hardware to a trace on system level
an understanding of the underlying operating system mechanisms is required. An
\gls{os} standard commonly used in the automotive industry is \gls{osekos}. It
is discussed in \autoref{section:osekvdxos}.
The real-time behavior of an embedded multi-core application can be represented
by a system trace. Based on a system trace an application can be examined and
specified timing requirements can be validated. \gls{btf} is a system level
trace format and used in this thesis. It is discussed in
\autoref{chapter:btf}.
There exist different techniques to record traces of embedded applications. In
\autoref{section:trace_measurement} an overview of these techniques is
provided. It is then argued why hardware tracing is the only technique
sufficient for the validation of embedded real-time applications. Accordingly,
hardware tracing is then discussed in more detail.
On the basis of the information in \autoref{chapter:fundamentals} the mapping
between software entities and system entities is described in
\autoref{chapter:mapping}. Mapping is done for all \gls{btf} entities that are
relevant for the analysis of embedded multi-core applications as discussed in
\autoref{subsection:btf_entity_types}.
In \autoref{chapter:validation} the mapping is validated. For that reason
criteria to compare \gls{btf} traces are established in
\autoref{subsection:validation_techniques}. Based on these criteria simulated
traces and traces recorded from hardware are compared and evaluated. This is done
in two steps. Firstly, test applications are created manually to cover all
possible \gls{btf} actions in \autoref{subsection:systematic_tests}. Secondly,
applications are created randomly to avoid selection bias in the creation of
test cases in \autoref{subsection:randomized_tests}.
Finally, the results of this thesis are discussed in
\autoref{chapter:conclusion} and possible topics for future work are outlined
in \autoref{chapter:future_work}.
\chapter{Mapping}
\label{chapter:mapping}
% {{{ Mapping Intro
Systems are analyzable on different levels of abstraction as shown in
\autoref{fig:trace_event_levels}. Depending on the use case, one level or
another is better suited to perform the required analysis. For example, a
hardware designer does not care about task states while a system engineer is
usually not interested in voltage levels of transistors in memory.
For the timing analysis of an embedded system a trace on system level is
required because timing requirements are usually specified for system entities
such as tasks or signals. Hence, system level traces contain the information
necessary to validate an application with respect to its timing behavior.
A trace must be long enough that all relevant entities appear with sufficient
frequency for the timing analysis. For example, at least two task
instances must be activated in one trace to calculate the activate-to-activate
time. Additionally, it is important not to influence the timing of an
application by trace measurement. Consequently, the only sufficient trace
technique for the timing analysis of embedded systems is hardware tracing
according to \autoref{tab:trace_tool_overview}.
\begin{figure}[]
\centering
\centerline{\includegraphics[width=\textwidth]{./media/mapping/concept_measurement_btf.pdf}}
\caption[Hardware to \gls{btf} trace basic idea]{Hardware tracing records
events on hardware level. This is not sufficient for the timing analysis of
an embedded system. Thus, it is necessary to transform the hardware events to
system events. This requires two steps. In the first step hardware events
are transformed to software events. This step is done by the trace software
and requires the application binary. The next transformation step produces a
trace on system level, e.g.\ in the \gls{btf} format. An \gls{orti} file as
well as additional information that can for example, come from a timing
model file (\gls{rte}) are required for this step.}
\label{fig:mapping_concept}
\end{figure}
Hardware tracing records events on hardware level. As stated above this level
is not sufficient for the timing analysis of an embedded system. Thus, it is
necessary to transform hardware events to system events as shown in
\autoref{fig:mapping_concept}. Two steps are required for this transformation.
Hardware level events must be transformed into software level events which are
then further processed into system level events.
The first step is done by the trace software. It is capable of analyzing and
interpreting the hardware events that are recorded from the processor. Via the
application binary files it is possible to map the raw memory addresses
contained in the hardware events to the corresponding symbols of the real
application as depicted in \autoref{fig:hardware_software_idea}.
\begin{figure}[]
\centering
\centerline{\includegraphics[width=\textwidth]{./media/mapping/hardware_software_idea.pdf}}
\caption[Hardware event to software event idea]{The trace software is capable
of transforming a hardware level event to a software level event. This
involves, for example, replacing memory addresses with the actual symbol names
based on the application binary (\gls{elf} file). Further actions may be
required depending on the trace device. Note that the displayed hardware
event is just a generalization, the actual structure can be different
depending on the trace device vendor.}
\label{fig:hardware_software_idea}
\end{figure}
Depending on the trace device, further steps may be required. For example, some
trace devices produce timestamps relative to the previous event, which must
then be transformed into absolute timestamps. Another example are program flow
traces. Hardware level program flow events are usually only recorded for
instructions that change the flow of an application as described in
\autoref{subsection:hardware_tracing}. Only with the application binary is it
possible for the software to reconstruct a complete program flow trace.
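The relative-to-absolute timestamp conversion mentioned above is a running sum over the per-event deltas, sketched here as a minimal example:

```python
from itertools import accumulate

# Sketch: convert per-event timestamp deltas to absolute timestamps
# by accumulating them from a known start time.
def to_absolute(deltas, start=0):
    return list(accumulate(deltas, initial=start))[1:]
```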
Based on the software level trace, a system level trace can be generated in the
next step. A suitable system level trace format is \gls{btf} which is
described in \autoref{chapter:btf}. It is capable of representing the behavior
of an application in a way that is eligible for its timing analysis. Different
additional information, e.g.\ the \gls{orti} file is required to
execute the transformation from software to system level trace.
% }}}
% {{{ Mapping Proceeding
\section{Mapping Proceedings}
Transformation from hardware to software level is done by the trace software.
Corresponding to the composition of an on-chip trace device it creates two
types of traces on software level: a data trace and a function trace.
Let $i$ be an index in $\mathbb{N}_{0}$ denoting an individual event
occurrence. Then a data event can be defined as a quintuple
\begin{equation}
\label{eq:data_event}
d_{i} = (t_i, \pi_i, a_i, v_i, c_i)
\end{equation}
where $t_i \in \mathbb{N}_{0}$ is the timestamp in nanoseconds, $\pi_i$ is the
name of the accessed variable, $a_i \in \{R, W\}$ is the way in which the
variable is accessed, either $R$ for read or $W$ for write, $v_i \in
\mathbb{N}$ is the value that was read or written, and $c_i$ is the core name
on which the access has occurred.
Consequently, a data trace can be defined as a sequence of data events where
$n \in \mathbb{N}_{0}$ is the number of events in the trace.
\begin{equation}
\label{eq:data_trace}
D = (d_1, d_2, \dots, d_n)
\end{equation}
Let $j$ be an index in $\mathbb{N}_{0}$ denoting an individual event
occurrence. Then a function event can be defined as a quadruple
\begin{equation}
\label{eq:function_event}
f_j = (t_j, \pi_j, \theta_j, c_j)
\end{equation}
where $t_j \in \mathbb{N}_{0}$ is the timestamp in nanoseconds, $\pi_j$ is the
name of the accessed function, $\theta_j \in \{A, \Omega\}$ indicates
whether the function has started ($A$) or terminated ($\Omega$), and $c_j$
is the core name on which the function event has occurred.
Analogously, a function trace can be defined as a sequence of function events
where $m \in \mathbb{N}_{0}$ is the number of events in the trace.
\begin{equation}
\label{eq:function_trace}
F = (f_1, f_2, \dots, f_m)
\end{equation}
Based on \autoref{eq:btf_trace}, \autoref{eq:data_trace}, and
\autoref{eq:function_trace} the goal is to describe a function $g$ so that
\begin{equation}
g: (D,\, F) \rightarrow B,
\end{equation}
where the timestamps $t$ of the events in $D$, $F$, and $B$ are relative to the
same point in time. However, $D$ and $F$ alone are not sufficient for the
transformation from software to system level for three reasons.
Firstly, the events on software level do not provide enough information to
decide which variable maps to a certain entity on system level. For example,
the state of each task is stored in a certain variable. Whenever the state
changes, this variable changes too and a data event is generated. However,
the transformation function does not know that the variable maps to the state
of a task. Because of that the \gls{orti} file described in
\autoref{subsection:osek_oil_and_orti} is required. Via this file it is
possible to relate variables to the corresponding system objects.
Secondly, not all entity types specified by \gls{btf}, for example runnables
and signals, are included in the \gls{orti} file. The former are included in
the function trace, the latter in the data trace. But if the transformation
function is not able to distinguish regular functions from runnables and
regular variables from signals, this information cannot be used.
Thus, it is necessary to provide a list of those entities to the transformation
function.
Finally, it is necessary to keep track of the internal state of an application.
If the \gls{orti} file is available it can be detected that a certain task has
changed its state. Consequently, a \gls{btf} event must be generated. Without
the knowledge about the previous task state however, it is not possible to
decide which task action has occurred. If the task changes into the running
state, this could mean that the task has started for the first time, resumed
from the ready state, or continued to run after polling a resource.
For these reasons the function $g$ must be redefined as
\begin{equation}
g': (D,\, F,\, o,\, l,\, S) \rightarrow (B,\, S')
\end{equation}
where $o$ is the \gls{orti} file of the traced application, $l = (l_r,\, l_s)$
is a tuple that contains a list of runnables $l_r$ and a list of signal names
$l_s$, and $S$ and $S'$ are the system states before and after the
transformation. The information that must be part of the system state $S$ is
discussed in the next sections.
% }}}
% {{{ ORTI Mapping
\section{ORTI Mappings}
\textbf{Task} entities are capable of executing twelve actions according to
\autoref{fig:process_state_chart} plus the additional notification event if the
\gls{mta} limit is exceeded. The lifecycle of a task entity starts with its
activation.
An \textbf{activation} can be detected via the \gls{orti} \emph{task status}
attribute. If no other task instance of the same task entity is active in the
system, a task whose state changes to ready is activated. However, this does
not work if a task instance of the same task is already active in the system.
This can happen if multiple task activations are allowed by the
\glsdesc{osekcc}. In case of a \gls{mta} the corresponding \gls{osek}
\emph{task status} attribute already indicates an active state (any state that
is not suspended) and will not change to ready again.
Consequently, another way to detect task activations is required. Via the
\gls{orti} \emph{currentactivations} attribute, the number of open activations
for each task can be detected. Whenever this attribute is incremented, a new
task activation \gls{btf} event must be created. Therefore, it is necessary to
keep track of the number of activations for each task entity in the system.
Only if the previous number of activations for a task is known, it is possible
to decide whether the value is incremented or decremented when a new data write
event occurs. Thus, the number of current activations for each process is
relevant information and must be part of the system state $S$.
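This bookkeeping for the \emph{currentactivations} attribute can be sketched
in C as follows. The listing is an illustrative sketch, not actual tool code;
it assumes that every data write to the attribute is delivered to the
transformation function, and all type and function names are assumptions.

\begin{code}
\begin{lstlisting}[caption={[Activation tracking sketch] An illustrative
sketch of how the previous \emph{currentactivations} value stored in the
system state can be used to classify a new data write. All names are
assumptions.}, label={listing:activation_sketch}]
#include <assert.h>

typedef enum { ACT_NONE, ACT_ACTIVATE, ACT_CONSUMED } act_event_t;

typedef struct {
    unsigned currentactivations; /* last known value, part of S */
} task_state_t;

/* Called for every data write to a task's currentactivations variable. */
act_event_t on_activations_write(task_state_t *task, unsigned new_value)
{
    act_event_t ev = ACT_NONE;
    if (new_value > task->currentactivations)
        ev = ACT_ACTIVATE;      /* a BTF activate event must be created */
    else if (new_value < task->currentactivations)
        ev = ACT_CONSUMED;      /* an open activation was consumed */
    task->currentactivations = new_value; /* update the system state */
    return ev;
}
\end{lstlisting}
\end{code}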
Since tasks have a lifecycle it is necessary to keep track of the instances for
each task entity. Whenever a new task is activated the instance counter must
be incremented and the counter value is assigned to the task. The same
procedure is necessary for all other entities that have a lifecycle. The
latest instance counter value for each entity must be available in the system
state $S$ to create correct \gls{btf} events. Additionally, it is necessary to
add newly created tasks to a list of task instances active in the system. When
a task's lifecycle ends, i.e., the task terminates, it is removed from this
list.
A \textbf{stimulus} is required to activate a task. Stimuli can be
\textbf{triggered} by process and by simulation entities. A stimulus triggered
by another process represents an \glsdesc{ipa} (\gls{ipa}). An \gls{ipa} is
implemented via the \lstinline{ActivateTask} service routine. The \gls{orti}
\emph{servicetrace} attribute can be used to detect when this routine is
executed. Whenever the \lstinline{ActivateTask} routine is entered and a task
is running on the same core a stimulus event is created with the task as the
source entity.
Alarms are the second way to activate tasks. The \emph{alarmtime} attribute
indicates how many ticks are left until an alarm expires. The \gls{orti} file
also contains the action that is executed by an alarm. Thus, a stimulus can be
triggered whenever an alarm that activates a task reaches an \emph{alarmtime}
value of zero.
\begin{table}[]
\centering
\begin{tabular}{r|l l}
Action & ORTI attribute & System state \\
\hline
trigger (ipa) & servicetrace (ActivateTask) & running task \\
trigger (alarm) & alarmtime & - \\
\end{tabular}
\caption[Stimulus event mapping]{In \gls{btf}, a stimulus must be triggered
so that it can activate a task. On target a task can be triggered via an
\gls{ipa} or by an alarm. The first can be detected via the
\emph{servicetrace} attribute, while the latter is indicated if the
\emph{alarmtime} attribute reaches the value zero.}
\label{tab:stimulus_mapping}
\end{table}
A triggered stimulus must be added to the system state. Later, when the actual
task activation is executed by the \gls{os} the latest stimulus is removed
from the system state and used to create a correct \gls{btf} event.
\autoref{tab:stimulus_mapping} summarizes how stimulus events are detected.
A \textbf{task} \textbf{start} event occurs if a task which was previously
active changes to running. There are two cases for which preempt and resume
actions must be created. The first case is a normal state change that can be
detected via the \emph{task status} attribute. A task is \textbf{preempted} if
the state changes from running to ready and \textbf{resumed} if the state
changes from ready to running.
However, the task state is not updated by the \gls{os} when a task is preempted
by an \gls{isr}. Consequently, a task preempt event must also be created, if
the \emph{runningisr2} attribute indicates that a new \gls{isr} is running on
the core and a resume event must follow once the \gls{isr} terminates
execution.
A task \textbf{terminate} event occurs if a running task changes into the
suspended state. The previous state need not be known because a task can only
be terminated from the running state.
However, there is a special case for task terminate events. As mentioned in
\autoref{subsection:osek_architecture}, a task with pending activations
switches directly into the ready state, after the current instance terminates.
To work around this problem it is necessary to detect when a certain task
instance executes the \lstinline{TerminateTask} service routine via the
\emph{servicetrace} attribute. If this happens a flag in the system state must
be set to indicate that the respective task instance has been terminated.
Whenever a task changes from running to ready this flag must be checked
to decide whether the corresponding event is a preemption or a termination.
A \textbf{wait} event occurs if a running task waits for an event that is not
set. In this case the \gls{os} will change the task state to waiting and the
task is removed from the core. A \textbf{release} event occurs once the event is set
and the \gls{os} changes the task state to ready.
\begin{code}
\begin{lstlisting}[caption={[Resource polling] The \gls{btf} polling state
indicates that a process is actively waiting for a resource. This listing
shows how this might be implemented in C.},
label={listing:resource_polling}]
TASK(EngineManager) {
/* Wait actively until EngineResource becomes available. */
while(GetResource(EngineResource) != E_OK);
engineRPM = calculateEngineRPM();
ReleaseResource(EngineResource);
TerminateTask();
}
\end{lstlisting}
\end{code}
\textbf{Poll} actions are more difficult to detect, since they are not directly
related to a concept specified by \gls{osekos}. The idea of the \gls{btf}
polling state is to indicate that a task is actively waiting for a resource.
In code this can be implemented via a loop in which a resource is requested
repeatedly until it becomes available as shown in
\autoref{listing:resource_polling}.
Via \emph{servicetrace} and \emph{lasterror}
it can be detected that a process has requested a locked resource: the
\emph{servicetrace} attribute indicates when the \lstinline{GetResource}
service routine is called and \lstinline{E_OS_ACCESS} is written to the
\emph{lasterror} attribute in case the resource is locked.
However, a single request does not necessarily mean that a change into the
polling state is happening. Instead, a task might execute one code segment if
the resource is available and a different one if it is not.
Therefore, it is necessary to set a \emph{previous request} flag for a task
instance that has requested a locked resource once. If another request follows
in the same running interval a poll event is generated. Once there are no more
requests, the last request must have been successful and a run event is created
to indicate the state change from polling to running. Then the previous
request flag must be cleared.
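A minimal sketch of this flag handling could look as follows. It is driven by
the \emph{servicetrace} and \emph{lasterror} observations described above;
all names are assumptions, and the actual bookkeeping would be kept per task
instance in the system state.

\begin{code}
\begin{lstlisting}[caption={[Previous request flag sketch] An illustrative
sketch of the previous request heuristic: the first failed request only sets
a flag, a second failed request yields a poll event, and a successful request
after polling yields a run event.}, label={listing:poll_flag_sketch}]
#include <assert.h>
#include <stdbool.h>

typedef enum { P_NONE, P_POLL, P_RUN } poll_event_t;

typedef struct {
    bool previous_request; /* a locked resource was requested once */
    bool polling;          /* the task entered the BTF polling state */
} poll_state_t;

/* Called when servicetrace/lasterror indicate a failed resource request. */
poll_event_t on_failed_request(poll_state_t *s)
{
    if (s->previous_request) {
        s->polling = true;  /* repeated request: actively waiting */
        return P_POLL;
    }
    s->previous_request = true;
    return P_NONE;
}

/* Called when a request is not followed by a lasterror write. */
poll_event_t on_successful_request(poll_state_t *s)
{
    bool was_polling = s->polling;
    s->previous_request = false; /* clear the flag again */
    s->polling = false;
    return was_polling ? P_RUN : P_NONE; /* polling -> running */
}
\end{lstlisting}
\end{code}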
A \textbf{park} action must be created if a task that is in polling state is
changed into the ready state. Next, it is necessary to detect resource state
changes of the resource which the parking task has been polling. If the
respective resource changes into an unlocked state, a \textbf{release\_parking}
event is created. On the other hand, if the resource stays locked and the task
changes back into running state, a \textbf{poll\_parking} event is required.
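The state change based decisions described above can be summarized in a small
dispatch function. The following listing is a hedged sketch with illustrative
names; poll and run events are omitted because they are derived from the
\emph{servicetrace} attribute rather than from a task state change.

\begin{code}
\begin{lstlisting}[caption={[Task action dispatch sketch] An illustrative
sketch that combines the new \gls{osek} task state from the \gls{orti} task
status attribute with the previous \gls{btf} process state kept in the system
state. All identifiers are assumptions.},
label={listing:task_action_sketch}]
#include <assert.h>
#include <stdbool.h>

typedef enum { OS_SUSPENDED, OS_READY, OS_RUNNING, OS_WAITING } os_state_t;
typedef enum { B_ACTIVE, B_READY, B_RUNNING, B_WAITING,
               B_POLLING, B_PARKING } btf_state_t;
typedef enum { A_NONE, A_START, A_RESUME, A_PREEMPT, A_TERMINATE,
               A_WAIT, A_RELEASE, A_PARK, A_POLL_PARKING } btf_action_t;

btf_action_t task_action(os_state_t os_new, btf_state_t btf_prev,
                         bool terminated_flag)
{
    switch (os_new) {
    case OS_RUNNING:
        if (btf_prev == B_ACTIVE)  return A_START;
        if (btf_prev == B_READY)   return A_RESUME;
        if (btf_prev == B_PARKING) return A_POLL_PARKING;
        return A_NONE; /* polling -> running is found via servicetrace */
    case OS_READY:
        if (btf_prev == B_RUNNING) /* pending activations skip suspended */
            return terminated_flag ? A_TERMINATE : A_PREEMPT;
        if (btf_prev == B_WAITING) return A_RELEASE;
        if (btf_prev == B_POLLING) return A_PARK;
        return A_NONE;
    case OS_SUSPENDED:
        return A_TERMINATE;
    case OS_WAITING:
        return A_WAIT;
    }
    return A_NONE;
}
\end{lstlisting}
\end{code}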
The \textbf{mtalimitexceeded} notification event is the last task event that
must be detected. This event is created if a task activation is triggered,
but no actual task instance is added to the system. An \gls{osek} compliant
\gls{os} writes an \lstinline{E_OS_LIMIT} error into the \emph{lasterror}
attribute, if a task activation is triggered, but the maximal \gls{mta} value
is already reached. To create a valid \gls{btf} event it is necessary to know
for which task entity the error is created. Since \gls{orti} does not provide
this information the creation of \emph{mtalimitexceeded} events is not
feasible. \autoref{tab:task_mapping} gives an overview of the task mapping.
\begin{table}[]
\centering
\begin{tabular}{r|l l}
Action & ORTI attribute & System state \\
\hline
activate & currentactivations & currentactivations, last stimulus \\
start & state (running) & state (active) \\
resume & state (running) & state (ready) \\
resume & runningisr2 & running task \\
preempt & state (ready) & task not terminated \\
preempt & runningisr2 & running task \\
terminate & state (suspended) & active tasks \\
terminate & state (ready) & task terminated \\
wait & state (waiting) & - \\
release & state (ready) & state (waiting) \\
poll & lasterror & servicetrace, previous request \\
run & servicetrace & state (polling) \\
park & state (ready) & state (polling) \\
poll\_parking & state (running) & state (parking) \\
release\_parking & resource state & state (parking) \\
mtalimitexceeded & lasterror & entity cannot be detected \\
\end{tabular}
\caption[Task event mapping]{Different pieces of information are required to
detect all possible task actions. The states in the \gls{orti} attributes
column are \gls{osek} task states while the states in the system information
column are \gls{btf} process states. The previous state is necessary to
create correct events. For example, a task state change to running could
mean a \gls{btf} start, resume or run event.
For some actions, it is necessary to use multiple approaches to detect them.
For example, a task terminate event happens if the \gls{osek} state changes
to suspended. However, if another instance of the same task is already
activated, a change to suspended does not occur. To catch this case it is
necessary to set a \emph{task terminated} attribute for a task instance when
it calls the \lstinline{TerminateTask} service routine.}
\label{tab:task_mapping}
\end{table}
\begin{table}[]
\centering
\begin{tabular}{r|l l}
Action & ORTI attribute & System state \\
\hline
activate & - & - \\
start & runningisr2 & \gls{isr} stack \\
resume & runningisr2 & \gls{isr} stack \\
preempt & runningisr2 & \gls{isr} stack \\
terminate & runningisr2 & \gls{isr} stack \\
\end{tabular}
\caption[\gls{isr} event mapping]{The \emph{runningisr2} attribute is used to
detect basic \gls{isr} actions. Because \glspl{isr} are not allowed to wait
for events, waiting state related actions must not be created. All other
actions can be detected in the same way as for task instances as shown in
\autoref{tab:task_mapping}.}
\label{tab:isr_mapping}
\end{table}
\textbf{\glspl{isr}} and tasks share the same \gls{btf} state model. However,
\gls{osek} does not specify a detailed state model for \glspl{isr} as it does
for tasks. Consequently, the basic process actions activate, start, resume,
preempt, and terminate are detected differently compared to task actions as
shown in \autoref{tab:isr_mapping}. \glspl{isr} are not allowed to wait for
events. Therefore, waiting related process state transitions must not be
considered. The detection of semaphore polling events works the same way as
for task events and is therefore not discussed again.
An \glsdesc{isr} is triggered by a hardware interrupt. This means if the
hardware detects a certain condition, e.g., an \gls{io} pin state changes from
high to low, the program flow is interrupted and a certain code section that is
mapped to this interrupt is executed. Depending on the trace device, it may or
may not be feasible to detect the activation of an interrupt via the
corresponding \gls{isr} control register.
In the former case, it is possible to create a stimulus and the resulting
activate event by detecting when the interrupt activate bit is set in the
corresponding control register. Otherwise, the \textbf{activate} event must be
created when the \gls{isr} changes into the running state for the first time.
In this case trigger, activate, and start event are all created with the same
timestamp.
\begin{figure}[]
\centering
\centerline{\includegraphics[width=\textwidth]{./media/mapping/isr_stacking.pdf}}
\caption[Running \gls{isr} stacking]{A stack can be used to track the active
\glspl{isr} in a system. This is necessary to create appropriate \gls{btf}
events. For example, the event when \emph{isr\_foo} is set as the running
\gls{isr}, is different, depending on the current state of the stack. If the
\gls{isr} is already on the stack, a resume event must be created, otherwise a
start event.}
\label{fig:isr_stacking}
\end{figure}
The currently running category two \gls{isr} is indicated by the
\emph{runningisr2} \gls{orti} attribute. Each \gls{isr} has a unique
\gls{id} that is written into the variable if the respective entity is
running. Otherwise, \emph{runningisr2} is zero, which indicates that no
\gls{isr} is active. The mapping from \gls{id} to name is included in the
\gls{orti} file. If
\emph{runningisr2} changes to the \gls{id} of a certain \gls{isr}, it is not
possible to decide whether this instance runs for the first time or whether it
is resumed, after it has been preempted by an \gls{isr} with higher priority as
shown in \autoref{fig:isr_stacking}.
Therefore, it is necessary to keep track of the active \gls{isr} instances in
the system, e.g.\ via a stack. Whenever the value of \emph{runningisr2}
changes it is checked whether the corresponding \gls{id} is already on the
stack. If so, the \gls{isr} was already running and has been
\textbf{preempted}. Consequently, the \gls{isr} that caused the preemption has
terminated and must be popped off the stack. The \gls{isr} that has been
preempted must be \textbf{resumed}.
The other case is that the new \gls{isr} has not been running yet, i.e.\ is not
on the stack. This means that the \gls{isr} on top of the stack, if there is
one, gets \textbf{preempted} and the new \gls{isr} is \textbf{started} and
pushed on the stack. If \emph{runningisr2} becomes zero the last \gls{isr} is
popped off the stack and \textbf{terminated}.
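The stack handling for \emph{runningisr2} changes can be sketched as follows.
This is an illustrative sketch with assumed names; the preempt and terminate
events that accompany each transition are noted in the comments.

\begin{code}
\begin{lstlisting}[caption={[Running ISR stack sketch] An illustrative
sketch of the \gls{isr} stack from \autoref{fig:isr_stacking}. A change of
\emph{runningisr2} to an \gls{id} already on the stack yields a resume event,
otherwise a start event; zero terminates the topmost \gls{isr}.},
label={listing:isr_stack_sketch}]
#include <assert.h>

#define MAX_ISRS 16
typedef enum { I_NONE, I_START, I_RESUME } isr_event_t;

typedef struct {
    int stack[MAX_ISRS];
    int top; /* number of active ISRs */
} isr_stack_t;

/* Called whenever runningisr2 changes; id == 0 means no ISR running. */
isr_event_t on_runningisr2(isr_stack_t *s, int id)
{
    if (id == 0) {               /* topmost ISR terminated */
        if (s->top > 0) s->top--;
        return I_NONE;
    }
    for (int i = 0; i < s->top; i++) {
        if (s->stack[i] == id) { /* already on the stack: it resumes */
            s->top = i + 1;      /* ISRs above it have terminated */
            return I_RESUME;
        }
    }
    s->stack[s->top++] = id;     /* new ISR; old top (if any) preempted */
    return I_START;
}
\end{lstlisting}
\end{code}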
As the name indicates, \emph{runningisr2} is only written for category two
interrupt routines. Regular \glspl{isr} are not managed by the \gls{os} and
therefore not detectable via \gls{orti} attributes. Instead function trace
must be utilized to detect when a category one \gls{isr} is started or
terminated. To map the function names to actual \gls{isr} entities, a list of
category one \glspl{isr} is required. If such a list is available, the
procedure is the same as described above.
\begin{table}[]
\centering
\begin{tabular}{r|l l}
Action & ORTI attribute & System state \\
\hline
start & - & running process \\
terminate & - & running process \\
suspend & task state & running process, process runnables \\
resume & task state & running process, process runnables \\
\end{tabular}
\caption[Runnable event mapping]{Runnable start and stop events can be
detected via function tracing. The source entity for a runnable event is the
process in whose context the runnable is executed. A runnable is suspended
when the corresponding process is preempted. If the process resumes, the
runnable is resumed, too.
One runnable can be called in the context of another runnable. This means
multiple runnables can be running within the same process context at the same
point in time. If this is the case, all running runnables must be suspended
and resumed.}
\label{tab:runnable_mapping}
\end{table}
\textbf{Runnable} actions are detectable via function events. Start and
terminate events must be created for function entry and function exit events.
A program flow trace contains the information about all functions in the
system. A list of runnable entity names is thus required to check whether a
function is a runnable or not.
Suspend events must be created, if the process context in which a runnable is
running is preempted and a resume event is required if the corresponding
process resumes. This means that whenever a process is deallocated, a
potentially active runnable must be suspended. Once the process is
reallocated the runnable also resumes.
Additionally, runnables can be nested, i.e.\ one runnable can be executed by
another runnable. If this happens it is important to suspend and resume all
running runnables, if the corresponding process is preempted and resumed.
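Assuming a per-process stack of active runnables in the system state, this
nesting can be tracked with a few lines of C. The listing is a sketch with
illustrative names, not actual tool code.

\begin{code}
\begin{lstlisting}[caption={[Runnable nesting sketch] An illustrative
sketch of per-process runnable tracking. Function entry and exit events push
and pop runnable names; on a process preemption or resumption, every runnable
still on the stack must be suspended or resumed.},
label={listing:runnable_stack_sketch}]
#include <assert.h>

#define MAX_NESTED 8

typedef struct {
    const char *runnables[MAX_NESTED]; /* runnables active in this process */
    int depth;
} process_ctx_t;

/* Function entry of a known runnable: runnable start event. */
void on_runnable_entry(process_ctx_t *p, const char *name)
{
    p->runnables[p->depth++] = name;
}

/* Function exit of a known runnable: runnable terminate event. */
void on_runnable_exit(process_ctx_t *p)
{
    if (p->depth > 0) p->depth--;
}

/* Number of suspend (or resume) events to emit when the owning process
 * is preempted (or resumed). */
int runnables_to_notify(const process_ctx_t *p)
{
    return p->depth;
}
\end{lstlisting}
\end{code}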
\begin{table}[]
\centering
\begin{tabular}{r|l l}
Action & ORTI attribute & System state \\
\hline
write & - & running process \\
read & - & running process \\
\end{tabular}
\caption[Signal event mapping]{Signals can be read or written. To create
valid \gls{btf} signal events, it is necessary to know which process is
currently running on the core, i.e., which process executed the read or
write.}
\label{tab:signal_mapping}
\end{table}
\textbf{Signal} events are detectable via data events. To decide which data
event corresponds to a signal event a list of signal names must be available.
With this list it can be decided if a certain data event results in a signal
event or not. The source entity for signal read events is the currently
running process as shown in \autoref{tab:signal_mapping}. If no process is
running an entity of type simulation can be used to set the value of the
signal.
\textbf{Event} actions are easily detectable via the \emph{servicetrace}
attribute. Via this attribute it is possible to create set, wait, and clear
event actions. However, in order to create valid event actions, it is also
necessary to know the event entity that relates to the respective action.
However, \gls{orti} does not specify \gls{os} event related attributes.
Therefore, it is not possible to create valid actions for this entity type.
\begin{table}[]
\centering
\begin{tabular}{r|l l}
Action & ORTI attribute & System state \\
\hline
ready & resource object & - \\
lock & resource locker & - \\
unlock & resource locker & - \\
full & resource locked, servicetrace & - \\
overfull & resource locked, servicetrace & - \\
\end{tabular}
\caption[Resource event mapping]{\gls{osek} resources can only be locked or
unlocked which means they do not support all semaphore actions. Lock and
unlock actions can be detected via the \gls{orti} locker attribute.
Full and overfull events are created if an already locked resource is
requested again. This is detectable via the \emph{servicetrace} attribute.
The resource for which the \emph{resource locked} attribute was read the
last time is the resource for which the error has occurred.}
\label{tab:resource_mapping}
\end{table}
\begin{table}[]
\centering
\begin{tabular}{r l l}
Action & ORTI attribute & System state \\
\hline
requestsemaphore & resource locker & - \\
assigned & resource locker & - \\
waiting & resource locked & - \\
released & resource locker & previous locker \\
\end{tabular}
\caption[Semaphore process event mapping]{Via the resource locker attribute
it is possible to detect whether a process has successfully requested a
semaphore.
The \emph{resource locker} attribute changes to the no task \gls{id} if the
resource is no longer locked. For this case it is necessary to know the task
that has previously locked the resource in order to create the correct
release event.
Waiting actions can be created by detecting data read events to the
\emph{resource locked} attribute.}
\label{tab:semaphore_process_mapping}
\end{table}
\textbf{Resource} entities must be initialized via the ready action before they
can be used in a \gls{btf} trace. This can be done at the beginning of a trace
with the timestamp zero. The \gls{orti} file contains a list of all resource
objects that are part of the application.
Since resources can only be locked or unlocked, they cannot change into the
semaphore used state. Consequently, only the state transition actions shown in
\autoref{tab:resource_mapping} can occur for resource events. Additionally,
only a subset of the process semaphore actions are required to represent the
behavior of resources.
Via the \gls{orti} \emph{resource locker} attribute it is possible to detect by
which task entity a resource is locked. This means a lock event can be
generated whenever the \gls{id} of a certain task is written to this attribute.
On the other hand, an unlock event is created when \emph{resource locker}
indicates that the respective entity is currently not locked by any task.
Moreover, it is necessary to assign a process to the locked resource once it
is locked by the task and to release it when the resource is released as
shown in \autoref{tab:semaphore_process_mapping}.
Full and overfull actions are created when a locked resource is polled by a
process. The semaphore waiting action is used to indicate the identity of the
polling process. As shown above, it is possible to detect whether a process is
polling a resource via the \emph{servicetrace} and \emph{lasterror} \gls{orti}
attributes. \emph{Lasterror} is set to \lstinline{E_OS_ACCESS} in case a
resource is already locked. The resource for which the polling occurs is
detectable via the \emph{resource locked} attribute. Whenever a certain
resource is requested the \gls{os} will read this attribute to decide whether
a request is allowed or not.
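The lock and unlock detection via the \emph{resource locker} attribute can be
sketched as follows. The listing is illustrative: \lstinline{NO_TASK} stands
for the no task \gls{id} mentioned above, and keeping the previous locker in
the system state allows the correct release event to be generated.

\begin{code}
\begin{lstlisting}[caption={[Resource locker sketch] An illustrative
sketch that classifies a data write to a resource locker attribute as a lock
or unlock event. All names are assumptions.},
label={listing:resource_locker_sketch}]
#include <assert.h>

#define NO_TASK 0
typedef enum { R_NONE, R_LOCK, R_UNLOCK } res_event_t;

typedef struct {
    int locker; /* previous locker, part of the system state */
} resource_t;

/* Called for every data write to the resource locker attribute. */
res_event_t on_locker_write(resource_t *r, int task_id)
{
    res_event_t ev = R_NONE;
    if (r->locker == NO_TASK && task_id != NO_TASK)
        ev = R_LOCK;     /* task_id locks the resource */
    else if (r->locker != NO_TASK && task_id == NO_TASK)
        ev = R_UNLOCK;   /* the previous locker releases it */
    r->locker = task_id; /* remember the locker for the release event */
    return ev;
}
\end{lstlisting}
\end{code}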
% }}}
% {{{ OS Specific Mapping
\section{OS Specific Mappings}
It is not feasible to create all \gls{btf} events relying solely on the
\gls{orti} file. For example, it is necessary to have a list of runnable and
signal names in order to create valid events for those entity types. But even
for entities that are supported by the \gls{orti} interface not all events
can be generated. It is possible to detect that the activation limit of a
task is exceeded; however, it is not possible to determine for which task
entity this happens.
Nevertheless, even though not all events are detectable via \gls{orti} alone,
an \gls{osekos} stores the information of interest internally. During a task
activation the \gls{os} must decide whether the \gls{mta} limit is reached or
not. To do so it is necessary to compare the current amount of
pending activations to the value of maximal allowed activations. Consequently,
the \gls{os} has to read certain information from memory which results in data
trace events.
Based on this argument all other events can be reconstructed, if the
corresponding \gls{os} specific operations are known. On the downside, it is
no longer possible to rely on a standardized interface like \gls{orti}. This
means the algorithm that does the transformation must be customized depending
on the \gls{os}. In this section the adaptations required to create a
\gls{btf} trace for the \gls{osek} compliant Erika Enterprise (\gls{ee})
\glsdesc{os} \cite{erika} are shown. In
\autoref{section:evaluation_test_bench} the reasons for choosing \gls{ee} are
discussed.
\textbf{Task} \emph{mtalimitexceeded} events cannot be created based on
\gls{orti} alone because the task entity for which the event occurs is not
detectable. One way to get this information is to remember which task's
\emph{currentactivations} attribute was read the last time. The \gls{os} has to
decide whether a task instance can be created once an activation is triggered.
To do so it compares the maximum allowed activations with the current number
of activations of a task. In other words, the \gls{os} reads the
\emph{currentactivations} attribute for the task that should be activated. If
the \gls{mta} limit is exceeded an error code is written.
\begin{figure}[]
\centering
\centerline{\includegraphics[width=\textwidth]{./media/mapping/mtalimitexceeded.pdf}}
\caption[Call stack for inter-core process activation]{A
\emph{mtalimitexceeded} event must be created if the \lstinline{E_OS_LIMIT}
error is set via the \emph{lasterror} \gls{orti} attribute. However, this is not
correct for Erika Enterprise multi-core applications. For a failing
inter-core inter-process activation the error code is written two times, once
on the source and once on the target core. Therefore, special care must be
taken, so that the \gls{btf} event is created only once.}
\label{fig:mtalimitexceeded}
\end{figure}
\begin{code}
\begin{lstlisting}[caption={[Task activations limit exceeded] Erika Enterprise
keeps track of the remaining activations that are allowed for a task entity.
If the value is zero and another activation occurs an \lstinline{E_OS_LIMIT}
error is set.}, label={listing:mtalimitexceeded}]
if ( EE_th_rnact[TaskID] == 0U ) {
ev = E_OS_LIMIT;
} else {
/* Do activation. Code removed for clarity. */
ev = E_OK;
}
if (ev != E_OK ) {
EE_ORTI_set_lasterror(ev);
EE_oo_notify_error_ActivateTask(TaskID, ev);
}
\end{lstlisting}
\end{code}
As it turns out this approach is not sufficient for multi-core systems.
Activation of a task entity by a task on another core via
\lstinline{ActivateTask} is implemented by a \glsdesc{rpc} (\gls{rpc}) as shown
in \autoref{fig:mtalimitexceeded}. The \gls{rpc} triggers an \gls{isr} on the
other core which performs the required action. In case of an inter-process
activation the \lstinline{ActivateTask} routine is executed again, but this
time on the core the target task is allocated to. If the \gls{mta} limit of
the task is exceeded an \lstinline{E_OS_LIMIT} error event is written and a
\emph{mtalimitexceeded} event is created.
However, the initiating core is notified by the remote \gls{isr} once the
service routine has finished. The corresponding error code is also returned
back to the initial core and written to the \emph{lasterror} attribute. The
resulting problem is that the transformation algorithm would create another
\emph{mtalimitexceeded} event based on the last read from the pending
activations variable on the initial core which is not correct.
A way to work around this problem can be derived by looking at a part of the
source code of the \lstinline{ActivateTask} implementation shown in
\autoref{listing:mtalimitexceeded}. It shows that \gls{ee} keeps track of the
remaining activations of each task in an array called \lstinline{EE_th_rnact}.
If the field for a specific task becomes zero, an \lstinline{E_OS_LIMIT} error
is written. This means if a task should be activated on one core and this
activation fails due to too many pending activations this will become clear by
a data read event to \lstinline{EE_th_rnact} directly followed by a write event
to the \emph{lasterror} attribute. For a remote activation there are multiple
other data events between the error and the previous read to
\lstinline{EE_th_rnact}. Therefore, no incorrect \emph{mtalimitexceeded} event
is created.
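This adjacency check can be sketched as follows. The listing is an
illustrative sketch: apart from \lstinline{EE_th_rnact} and the
\emph{lasterror} attribute, all names and the event interface are
assumptions.

\begin{code}
\begin{lstlisting}[caption={[MTA adjacency check sketch] An illustrative
sketch of the adjacency heuristic: an \lstinline{E_OS_LIMIT} write to
\emph{lasterror} yields an \emph{mtalimitexceeded} event only if the
immediately preceding data event was a read of \lstinline{EE_th_rnact}.},
label={listing:mta_adjacency_sketch}]
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef struct {
    bool last_was_rnact_read;
    int  last_rnact_task; /* task whose activation count was read */
} limit_state_t;

/* Returns the task id for which mtalimitexceeded must be emitted,
 * or -1 if no event is created. */
int on_data_event(limit_state_t *s, const char *var, bool is_read,
                  int task_id, bool is_os_limit)
{
    if (is_read && strcmp(var, "EE_th_rnact") == 0) {
        s->last_was_rnact_read = true;
        s->last_rnact_task = task_id;
        return -1;
    }
    int result = -1;
    if (!is_read && strcmp(var, "lasterror") == 0 && is_os_limit
        && s->last_was_rnact_read)
        result = s->last_rnact_task; /* local, failed activation */
    s->last_was_rnact_read = false;  /* any other event breaks adjacency */
    return result;
}
\end{lstlisting}
\end{code}

For a remote activation, the intervening data events clear the adjacency
flag, so the returned error code on the initial core does not produce a
second event.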
\begin{figure}[]
\centering
\centerline{\includegraphics[width=\textwidth]{./media/mapping/deltaqueue.pdf}}
\caption[Alarm delta queue implementation]{\gls{ee} implements alarms
via a delta queue. There is one queue, containing the corresponding
alarms, for each counter. Each alarm has a delta value that indicates after
how many ticks in relation to the previous alarm it must be executed. Only
the delta of the first alarm in the queue must be decremented for each counter
tick. If an alarm expires it is removed from the queue, and inserted again in
case it is cyclic.
In this example Alarm 2 expires after three ticks. Since Alarm 5 has a
delta of zero it expires at the same counter cycle. Alarm 4 expires after
six cycles, i.e.\ the sum of its own and all previous deltas.
}
\label{fig:deltaqueue}
\end{figure}
\begin{table}[]
\centering
\begin{tabular}{r|l l}
Action & Variable & Additional Information \\
\hline
mtalimitexceeded & lasterror & previous data read event \\
trigger (alarm) & alarm action type & \gls{orti} \\
\end{tabular}
\caption[OS task and stimulus event mapping]{Via \gls{orti} it is not
possible to detect for which task an \lstinline{E_OS_LIMIT} event has been
created. However, the data read event before this error can be used to get
this information.
Additionally, alarm trigger events cannot be created via the \emph{alarmtime}
attribute in Erika Enterprise, because it is not implemented in an \gls{osek}
compliant way. Instead, read events to the \lstinline{ActionType} attribute
of an alarm can be used to detect when a stimulus event must be created.}
\label{tab:task_mapping_os}
\end{table}
\textbf{Stimulus} events must be created for inter-process and alarm
activations as shown in \autoref{tab:stimulus_mapping}. An alarm activation
stimulus is created if the \gls{orti} \emph{alarmtime} attribute becomes zero.
However, \gls{ee} \gls{os} does not update this attribute in compliance with
the \gls{osek} specification \cite{erikaaltick}. Hence, another technique is
required to detect alarm events.
\gls{ee} keeps track of all active alarms in a delta queue as shown in
\autoref{fig:deltaqueue}. There is one queue for each counter. Whenever a
counter is incremented the delta of the first element in the queue is
decremented. If the delta of the first alarm in the queue becomes zero this
alarm and all following alarms with a delta of zero expire and the
corresponding actions are executed.
For an expiring alarm the \gls{os} is required to execute the corresponding
action. As shown in \autoref{tab:task_mapping_os} each alarm has an
\lstinline{ActionType} attribute. Via this attribute the \gls{os} determines
the correct action for an alarm. In other words, if an alarm expires this
attribute must be read and a data read event is generated. Consequently, a
\gls{btf} stimulus event is created whenever the action type attribute of
an alarm is read. The exact action executed by an alarm, e.g.\ which task is
activated for a process activation is read from the \gls{orti} file.
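The delta queue behavior of \autoref{fig:deltaqueue} can be reproduced by
the following sketch. The data layout and names are illustrative and do not
match the \gls{ee} internals; only the decrement-the-head rule and the
expiry of all leading zero-delta alarms are taken from the description
above.

\begin{code}
\begin{lstlisting}[caption={[Delta queue sketch] An illustrative sketch of
a counter tick on an alarm delta queue. Only the head delta is decremented;
every leading alarm with a delta of zero expires and is removed.},
label={listing:delta_queue_sketch}]
#include <assert.h>

#define MAX_ALARMS 8

typedef struct {
    int deltas[MAX_ALARMS]; /* delta to the previous alarm in the queue */
    int count;
} delta_queue_t;

/* Advances the counter by one tick and returns the number of expiring
 * alarms; each expiry corresponds to a read of the alarm's ActionType
 * attribute and thus to a BTF stimulus event. */
int counter_tick(delta_queue_t *q)
{
    if (q->count == 0) return 0;
    if (q->deltas[0] > 0) q->deltas[0]--;
    int expired = 0;
    while (q->count > 0 && q->deltas[0] == 0) {
        expired++;
        q->count--;             /* remove the expired head alarm */
        for (int i = 0; i < q->count; i++)
            q->deltas[i] = q->deltas[i + 1];
    }
    return expired;
}
\end{lstlisting}
\end{code}

With the deltas from \autoref{fig:deltaqueue} (3, 0, 3), two alarms expire
on the third tick and the last one on the sixth.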
\textbf{Event} actions must include the information about the affected event.
For example, if a task sets an event it is necessary to know the target task
and event for this action. \gls{orti} makes it possible to detect when an
event related service routine is executed; however, no information about the
event itself is made available.
\begin{code}
\begin{lstlisting}[caption={[Set event] Erika Enterprise uses the
\lstinline{EE_th_event_active} array to keep track of the events set for each
task. If a new event is set the mask is updated by connecting the previous
events and the new event via bitwise or. It is not possible to set an event
for a suspended task.},
label={listing:set_event}]
if ( EE_th_status[TaskID] == SUSPENDED ) {
ev = E_OS_STATE;
} else {
/* Set the event mask only if the task is not suspended */
EE_th_event_active[TaskID] |= Mask;
/* Check if the TASK was waiting for an event we just set */
if ((EE_th_event_waitmask[TaskID] & Mask) != 0U)
{
/* Activate task here */
}
}
\end{lstlisting}
\end{code}
\begin{table}[]
\centering
\begin{tabular}{r|l l}
Action & Variable & Additional Information \\
\hline
wait\_event & \lstinline!EE_th_event_waitmask! & previous wait mask\\
clear\_event & \lstinline!EE_th_event_active! & previous active mask\\
set\_event & \lstinline!EE_th_event_active! & previous active mask\\
all actions & - & event bit from eecfg.h \\
\end{tabular}
\caption[OS specific event mapping]{Erika Enterprise uses two arrays to keep
track of the event states for each task entity. Via write events to these
arrays and the previous event state for a task instance correct \gls{btf}
events can be generated.}
\label{tab:os_event_mapping}
\end{table}
Erika Enterprise uses two arrays to keep track of the event related state of a
task: In \lstinline{EE_th_event_active} the events currently set for a
specific task instance are stored and \lstinline{EE_th_event_waitmask} includes
the information about which events a task entity is waiting for. Each field in
the array corresponds to one task and each bit of a field is related to a
certain event. Whenever a task is terminated both event masks are cleared.
Using these arrays it is possible to create correct events as shown in
\autoref{tab:os_event_mapping}. Whenever an \gls{os} event related service
routine is executed the corresponding event mask is updated. For example, if
an event is set for a specific task, the event mask is updated based on the new
event. This means the events which are currently set for a task and the new
event are connected via the bitwise \emph{or} operation as shown in
\autoref{listing:set_event}.
Hence, a data write event to one of those arrays is created whenever an event
service routine is executed. However, only the new state of the bitmask
becomes available. To determine the event \gls{id} it is necessary to remember
the previous state of the mask. By executing a bitwise \emph{exclusive-or}
operation on previous and current mask, the bit of the current event is
computed.
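The derivation above can be sketched in C; a minimal illustration assuming the
previous and current values of \lstinline{EE_th_event_active} are available
from the trace (the helper name is hypothetical):

```c
#include <stdint.h>

/* XOR of the previous and the current event mask leaves exactly the
 * bits that changed with the traced write, i.e. the bit of the event
 * that was just set or cleared. */
static uint32_t changed_event_bits(uint32_t prev_mask, uint32_t curr_mask)
{
    return prev_mask ^ curr_mask;
}
```

For example, if the mask changes from \lstinline{0x5} to \lstinline{0x7}, the
exclusive-or yields \lstinline{0x2}, the bit of the event that was just set.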
Unfortunately, this information is still not enough to create a valid \gls{btf}
event. For each bit it is necessary to know the corresponding entity name.
\glsdesc{ee} defines the bitmask for each \gls{os} event in the \emph{eecfg.h}
file which is created during the code generation process. By parsing the event
defines the mapping between bit and event name is retrieved.
\begin{code}
\begin{lstlisting}[caption={[Spin in for global resource request] In case a
global resource (a resource used on multiple cores) is requested, Erika
Enterprise uses a spinlock mechanism to lock the CPU until the resource
becomes available.},
label={listing:get_resource_spin}]
/* if this is a global resource, lock the others CPUs */
if (isGlobal) {
EE_hal_spin_in((EE_TYPESPIN)ResID);
}
\end{lstlisting}
\end{code}
\textbf{Resource} events, or in \gls{btf} terms semaphore events, can be
created based on the information provided by \gls{orti} as shown in
\autoref{tab:resource_mapping}. However, certain semaphore events like
waiting can only occur in multi-core systems. In a single-core system it is
not possible that one task polls a resource that is already occupied because
of the priority ceiling protocol.
Erika Enterprise implements inter-core resource requests via spinlocks. If a
task requests a resource that is locked by a task on another core, the service
routine does not return an error code but starts spinning as shown in
\autoref{listing:get_resource_spin}. As a consequence, the mapping for full,
overfull, and waiting actions introduced in the previous section does not
work.
To work around this problem, it is necessary to understand how spinlocks are
implemented in Erika Enterprise. The state of each spinlock is stored in the
\lstinline{EE_hal_spin_status} array where each field corresponds to a separate
spinlock. A value of one indicates that the spinlock is locked; otherwise the
value is zero. The \lstinline{EE_hal_spin_in} method is implemented via the
atomic compare-and-swap operation. This method is used to write a one into a
certain spinlock field, but only if the spinlock is currently free.
Compare-and-swap returns a value that indicates whether the operation was
successful or not. In the latter case the operation is executed again until it
succeeds.
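The mechanism can be modeled with C11 atomics; a minimal sketch of the
spin-in/spin-out behavior, not Erika Enterprise's actual implementation:

```c
#include <stdatomic.h>

#define MAX_SPINLOCKS 8  /* illustrative size */

/* One field per spinlock: 1 = locked, 0 = free
 * (modeled after EE_hal_spin_status). */
static atomic_uint spin_status[MAX_SPINLOCKS];

/* Spin until the compare-and-swap succeeds, i.e. until a 1 was
 * written into a field that was 0. Every failed attempt is a data
 * access to spin_status and thus visible in a hardware trace. */
static void spin_in(unsigned int id)
{
    unsigned int expected;
    do {
        expected = 0;
    } while (!atomic_compare_exchange_weak(&spin_status[id], &expected, 1));
}

static void spin_out(unsigned int id)
{
    atomic_store(&spin_status[id], 0);
}
```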
Compare-and-swap operations result in a data access to the variable for which
the operation is executed. Therefore, it is possible to detect when a spinlock
is polled based on data access events to \lstinline{EE_hal_spin_status}. This
information can then be used to create correct semaphore events as shown in
\autoref{tab:os_semaphore_process_mapping}.
Whenever the \emph{resource locker} attribute is read within the context of the
\lstinline{GetResource} service routine, the corresponding resource entity must
be stored in the system state. If the resource is free, a write event to the
\emph{resource locker} attribute follows and the corresponding \gls{btf} events
can be created as described above.
If there is no write event to the \emph{resource locker} attribute the
resource is currently locked and the \gls{os} starts spinning which is
detectable by continuous data access events to the field of
\lstinline{EE_hal_spin_status} relating to the requested semaphore.
Consequently, the running process is assigned to the semaphore via the waiting
action and an overfull action must be created. The process is now in polling
mode. Once there are no further accesses to \lstinline{EE_hal_spin_status}
the request was successful, the task state changes to running and the resource
state to full.
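The reconstruction described above can be sketched as a small state machine
over the traced data accesses. The type and event names below are hypothetical
and the sketch is simplified: in particular, the end of polling is modeled here
by the write to the \emph{resource locker} attribute rather than by the absence
of further \lstinline{EE_hal_spin_status} accesses.

```c
/* Traced access kinds relevant for semaphore reconstruction. */
typedef enum {
    ACC_LOCKER_READ,   /* resource locker read in GetResource context */
    ACC_LOCKER_WRITE,  /* resource locker written: lock acquired      */
    ACC_SPIN_STATUS    /* access to the spinlock status field         */
} access_t;

typedef enum { SEM_IDLE, SEM_REQUESTED, SEM_POLLING, SEM_LOCKED } sem_state_t;

/* Advance the reconstructed semaphore state for one traced access. */
static sem_state_t sem_step(sem_state_t s, access_t a)
{
    switch (s) {
    case SEM_IDLE:
        /* GetResource entered: remember the requested resource. */
        return (a == ACC_LOCKER_READ) ? SEM_REQUESTED : s;
    case SEM_REQUESTED:
        if (a == ACC_LOCKER_WRITE)
            return SEM_LOCKED;   /* resource was free: full action      */
        if (a == ACC_SPIN_STATUS)
            return SEM_POLLING;  /* occupied: waiting + overfull action */
        return s;
    case SEM_POLLING:
        /* Further spin accesses keep polling; acquiring ends it. */
        return (a == ACC_LOCKER_WRITE) ? SEM_LOCKED : s;
    default:
        return s;
    }
}
```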
\begin{table}[]
\centering
\begin{tabular}{r l l}
Action & Variable & Additional Information \\
\hline
waiting & \lstinline!EE_hal_spin_status! & running task, requested resource \\
full & \lstinline!EE_hal_spin_status! & requested resource \\
overfull & \lstinline!EE_hal_spin_status! & requested resource \\
\end{tabular}
\caption[OS specific semaphore event mapping]{Not all \gls{btf} semaphore
actions can be created based on \gls{orti} alone for an Erika Enterprise
multi-core application. This is because inter-core resource requests are
implemented via spinlocks. Spinlock operations can be detected via the
\lstinline{EE_hal_spin_status} array.}
\label{tab:os_semaphore_process_mapping}
\end{table}
% }}}
\section{OSEK/VDX OS}
\label{section:osekvdxos}
\Gls{osek} (\glsdesc*{osek}) \cite{osek} is an effort of the German and French
automotive industry to establish common standards for the software architecture
of distributed control units in vehicles. Defining a common architecture for
communication, operating systems, and network management avoids problems that
arise otherwise by using different interfaces and protocols. An abstraction
layer between hardware and software allows \Gls{osek} compliant applications to
be reused on different hardware platforms with minor modifications.
\gls{osekos} specifies the architecture of a real-time operating system for
single processors. Based on the services offered by the \gls{os}, integration
of modules from different manufacturers is possible. The \gls{os} meets the
hard real-time requirements demanded by automotive applications. \gls{osekos}
can also be used in multi-core environments. In such cases a separate kernel is
executed on each core. Service routines can be used to interact between
multiple \gls{os} instances.
A high level of flexibility is required for an \gls{os} to support real-time
systems on various target platforms. In order to support low-end and high-end
microcontrollers alike \gls{osek} conformance classes (\glspl{osekcc}) are
specified. Depending on the \gls{osekcc} certain features, e.g.\ multiple task
activations, multiple tasks per priority, and extended tasks are available or
not.
Dynamic creation of system objects like tasks, alarms or events is not
supported by \gls{osekos}. All objects are defined statically and created
during the system generation phase \cite{osekos}. Consequently, all \gls{os}
entities are known before the system execution.
\autoref{fig:os_module_abstraction} illustrates the abstraction of
application modules from hardware resources. Standardized system services
offer functionality that can be used by all application modules. Well-defined
service calls, type definitions, and constants are specified and ensure the
portability of an application to different architectures.
An \glsdesc{io} (\gls{io}) module parallel to the \gls{os} gives access to
microcontroller specific functionality like serial interfaces or
analog-to-digital converters. \gls{io} interfaces are not specified by
\gls{osekos}, which opposes the idea of easy portability. \gls{osek}'s
follow-up standard \gls{autosar} (\glsdesc{autosar}) \cite{autosar} solves this
problem by adding a \gls{mcal} (\glsdesc{mcal}) to the \gls{autosaros}
specification \cite{autosarbsw}.
In 2003 \gls{autosar} was established by automobile \glspl{oem}, suppliers, and
tool developers pursuing the same goals as \gls{osek}. Different parts of
the \gls{autosar} standard are based on \gls{osek} and \gls{autosaros}
constitutes a superset of \gls{osekos}. Consequently, all features discussed
here are also relevant for \gls{autosaros}. Differences that are important in
the context of this thesis are mentioned explicitly.
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{./media/osek/os_module_abstraction.pdf}
\caption[\gls{osekos} architecture]{\gls{osek} compliant \glspl{os} abstract
application modules and hardware via an \gls{os} layer. A non standardized
\gls{io} module still results in hardware dependencies.}
\label{fig:os_module_abstraction}
\end{figure}
\subsection{OSEK Architecture}
\label{subsection:osek_architecture}
\gls{osek} provides a specification for the architecture of an embedded
real-time \gls{os}. One of the main purposes of the \gls{os} is to manage the
available computational resources of the CPU\@. Based on different factors
such as priority, task group, and scheduling policy, executable entities,
so-called processes, are given access to the processor core. The procedure of
deciding which entity is executed next is called scheduling.
There are two types of process entities available: tasks and Interrupt Service
Routines (\glspl{isr}). The former are scheduled on task level, the latter on
interrupt level. Entities on interrupt level always have
precedence over entities on task level. Scheduling on interrupt level depends
solely on the priority of an entity and is done by hardware. For task entities
scheduling is done by the \gls{os} and depends on priority, scheduling policy,
and task group.
\textbf{Tasks} are categorized into two types by \gls{osekos}. A basic task
has three states: ready, running, and suspended. An extended task is a basic
task with the additional waiting state. Suspended tasks are passive and can be
activated. A task in the ready state can be allocated to the CPU for
execution which is then indicated by the running state. Only one task per
core can be in the running state at a given point in time. Extended tasks can
wait passively for an event. In that case they reside in waiting state.
Waiting tasks are not allocated to the CPU.
\begin{figure}[]
\centering
\includegraphics[width=0.7\textwidth]{./media/osek/extended_task_state_model.pdf}
\caption[\gls{osekos} task state model]{Task state model of an extended
\gls{osekos} task. A basic task cannot enter the waiting state.}
\label{fig:extended_task_state_model}
\end{figure}
Different task state transitions are possible as shown in
\autoref{fig:extended_task_state_model}. At system initialization all tasks
are suspended. If a task has to be executed it must be activated by a system
service. A task can be started by the \gls{os} in order to be executed. A
task is preempted if a task of higher priority is scheduled. Once a task has
finished execution it terminates and switches to the suspended state. Extended
tasks can wait for system events and are released and switched to ready once
the expected event is set. The previous state of a ready task is not
implicitly known.
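These transitions can be sketched as a small validity check; the enum and
function names below are illustrative and not part of any OSEK header:

```c
#include <stdbool.h>

typedef enum { SUSPENDED, READY, RUNNING, WAITING } task_state_t;

/* Legal transitions of the extended task state model:
 * activate:  suspended -> ready      start:     ready   -> running
 * preempt:   running   -> ready      wait:      running -> waiting
 * release:   waiting   -> ready      terminate: running -> suspended */
static bool transition_valid(task_state_t from, task_state_t to)
{
    switch (from) {
    case SUSPENDED: return to == READY;
    case READY:     return to == RUNNING;
    case RUNNING:   return to == READY || to == WAITING || to == SUSPENDED;
    case WAITING:   return to == READY;
    }
    return false;
}
```

A basic task never enters the waiting state, so for basic tasks the
\lstinline{RUNNING}~$\rightarrow$~\lstinline{WAITING} transition does not occur.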
\begin{figure}[]
\centering
\includegraphics[width=0.9\textwidth]{./media/osek/isr_example.pdf}
\caption[\gls{isr} scheduling behavior]{\gls{isr} scheduling is done by
hardware and depends solely on the interrupt priority. \glspl{isr} do not
have a ready state because they are started by hardware.}
\label{fig:isr_example}
\end{figure}
Priorities are assigned to tasks and \glspl{isr} statically. The lowest
priority is zero and greater integers mean a higher priority. If an \gls{isr}
of priority zero is running and another \gls{isr} of priority one is activated,
the first \gls{isr} is preempted and restarts once the second \gls{isr} is
terminated as shown in \autoref{fig:isr_example}. For tasks the same scenario
is dependent on scheduling policy and task group.
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{./media/osek/non_vs_full_preemptive_scheduling.pdf}
\caption[Non vs full preemptive scheduling]{Scheduling behavior of a non (top)
vs a full (bottom) preemptive task. A non preemptive task finishes execution
even though a task with higher priority is in ready state. Only for certain
system services, for example, an inter-process activation, the other task may
be scheduled. A full preemptive task is preempted if a task with higher
priority is activated. Once this task has terminated, the task with lower
priority can continue running.}
\label{fig:non_vs_full_preemptive_scheduling}
\end{figure}
\gls{osekos} specifies three \textbf{scheduling policies}: non, full, and mixed
preemptive scheduling. For a non preemptive task, rescheduling is only
possible if a system routine that causes rescheduling, e.g.\ an inter-process
activation or an explicit scheduler call is executed. A full preemptive task
can be rescheduled at any point in time during its execution if another task of
higher precedence is activated as shown in
\autoref{fig:non_vs_full_preemptive_scheduling}. A mixed preemptive system
contains tasks with both non and full preemptive scheduling policies.
Otherwise the system is either non or full preemptive.
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{./media/osek/task_group_example.pdf}
\caption[Scheduling of tasks in task groups]{Scheduling of task entities is
not only dependent on priority and scheduling policy. \gls{osek} specifies
task groups, which change the priority of tasks inside in relation to tasks
outside a specific group. In this example Task 3 has a greater priority than
Task 2. However, because they are in the same group, Task 2 inherits the
priority of Task 1. Thus, Task 2 is not preempted by Task 3.}
\label{fig:task_group_example}
\end{figure}
The precedence of a task is not necessarily due to its priority. \gls{osekos}
introduces the concept of task groups, which allows combining multiple tasks
into a group. A task which is not within a group has precedence over a task
within a group only, if its priority is higher than the priority of the task
with the highest priority within this group. This means a task acts non
preemptively towards another task if the task with the highest priority within
the group has a greater priority than the other task as shown in
\autoref{fig:task_group_example}.
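The rule can be expressed compactly; an illustrative helper mirroring the
example in \autoref{fig:task_group_example}, not an \gls{os} interface:

```c
#include <stdbool.h>

/* Highest priority of all tasks within a group (its "ceiling"). */
static int group_ceiling(const int *prios, int n)
{
    int max = prios[0];
    for (int i = 1; i < n; i++)
        if (prios[i] > max)
            max = prios[i];
    return max;
}

/* A task outside a group preempts a task inside the group only if its
 * priority exceeds the highest priority found within that group. */
static bool preempts_group_member(int outsider_prio,
                                  const int *group_prios, int n)
{
    return outsider_prio > group_ceiling(group_prios, n);
}
```

With the figure's constellation, a group containing priorities 5 (Task 1) and
2 (Task 2) is not preempted by an outside task of priority 3 (Task 3), even
though 3 exceeds Task 2's own priority.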
\textbf{\gls{osek} Conformance Classes} are used to adapt applications to
different hardware capacities such as available memory and CPU speed. Only one
\gls{osekcc} can be active at a time and cannot be changed during runtime.
Basic \glspl{osekcc} (BCC1 and BCC2) allow basic tasks only, while extended
\glspl{osekcc} (ECC1 and ECC2) allow basic and extended tasks. Level one
\glspl{osekcc} (BCC1 and ECC1) allow multiple tasks per priority and multiple
activation requests per task. For level two \glspl{osekcc} (BCC2 and ECC2)
multiple tasks can share the same priority and the same task can be activated
multiple times as shown in \autoref{tab:conformance_class}. This means BCC2
and ECC2 allow \glsdesc{mta} (\glspl{mta}). An active task with pending
activations becomes ready again immediately after termination.
\begin{table}[]
\centering
\begin{tabular}{r|c c c c}
& BCC1 & BCC2 & ECC1 & ECC2 \\
\hline
\gls{mta} & no & yes & no & yes \\
Multiple tasks per priority & no & yes & no & yes \\
Extended tasks & no & no & yes & yes \\
\end{tabular}
\caption[\gls{osek} conformance classes]{\gls{osekos} specifies multiple
\glspl{osekcc} to respect the computational capacities of different
platforms. Depending on the \gls{osekcc} different features are supported or
not.} \label{tab:conformance_class}
\end{table}
Task scheduling is done by the \gls{os} while \gls{isr} scheduling is done by
hardware. \glspl{isr} can be divided into category one and category two.
Category one \glspl{isr} do not run under \gls{os} control and are thus not
allowed to call \gls{os} services. Category two \glspl{isr} are monitored by
the \gls{os} and are allowed to execute a subset of the available \gls{os}
services. Tasks are always preempted by \glspl{isr} and can only continue
running when all \glspl{isr} have terminated.
Tasks and \glspl{isr} serve as containers for application specific functions.
These functions are not managed by the \gls{os} and must be added to the
process code by the user. \gls{autosar} introduced the concept of runnables to
solve problems related to the \gls{vfb} (\glsdesc{vfb}) introduced by the
\gls{autosar} architecture \cite{naumann2009autosar}. A runnable is
essentially the same as a function.
\textbf{Events} are system objects that can be set or not. Each event is owned
by at least one extended task. Only a task that owns an event is allowed to
clear and to wait for it. When waiting for an event a task switches into the
waiting state. It is switched back to ready when the corresponding event
is set.
All tasks and category two \glspl{isr} are allowed to set an event. Events are
used as a binary communication technique. One task can signal another one,
for example, if a certain resource has been released. Events are defined and
assigned to tasks before runtime. All events assigned to a task are cleared
when this task is activated.
\textbf{Resource} management is used to manage access to shared objects. An
\gls{osek} resource is basically a mutex. Each resource gets a ceiling
priority that is at least as high as the highest priority of all tasks that
access this resource. When a task accesses a resource and its priority is
lower than the ceiling priority of this resource its priority is raised to the
ceiling priority. The priority is reset to the original value once the task
releases the resource.
This technique ensures that, while a resource is occupied, no other task that
potentially accesses this resource can switch into the running state. This prevents priority
inversion and deadlocks. On the downside, tasks with a priority lower than the
ceiling priority may be delayed by a lower priority task.
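The ceiling mechanism can be sketched as follows; a minimal model with
illustrative structures that ignores nested resource requests:

```c
/* Minimal sketch of the priority ceiling protocol;
 * illustrative structures, not a vendor implementation. */
typedef struct {
    int prio;       /* current (dynamic) priority       */
    int base_prio;  /* statically assigned priority     */
} task_t;

typedef struct {
    int ceiling;    /* >= highest priority of all tasks
                     * that access this resource        */
} resource_t;

static void get_resource(task_t *t, const resource_t *r)
{
    if (t->prio < r->ceiling)
        t->prio = r->ceiling;  /* raise to ceiling while holding */
}

static void release_resource(task_t *t)
{
    t->prio = t->base_prio;    /* restore the original priority */
}
```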
\textbf{Alarms} are used to activate a task, set an event or execute an
alarm-callback routine. Each alarm has an alarmtime and a cycletime that are
statically defined and measured in ticks. An alarm expires the first time
after alarmtime ticks and afterwards every cycletime ticks. Thus, an alarm can
be used to activate a task or set an event periodically.
Each alarm is assigned to a counter object but each counter can be used by
multiple alarms. Counters are responsible for triggering an alarm after the
specified number of ticks have passed. Each \gls{osekos} offers at least one
counter that is based on a hard- or software timer.
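The expiry arithmetic can be sketched as follows; a hypothetical structure
mirroring the alarmtime/cycletime semantics described above:

```c
#include <stdint.h>

typedef struct {
    uint32_t alarmtime;  /* ticks until the next expiry     */
    uint32_t cycletime;  /* 0 for single-shot alarms        */
    int      running;    /* alarm state: running or stopped */
} alarm_t;

/* Advance an alarm by one counter tick; returns 1 if it expired. */
static int alarm_tick(alarm_t *a)
{
    if (!a->running)
        return 0;
    if (--a->alarmtime != 0)
        return 0;
    if (a->cycletime != 0)
        a->alarmtime = a->cycletime;  /* periodic: rearm with cycletime */
    else
        a->running = 0;               /* single-shot alarm stops        */
    return 1;                          /* expired: trigger the action    */
}
```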
\textbf{Hook routines} can be used to allow user-defined code within OS
internal processing. They cannot be preempted by tasks and \glspl{isr} and
only a subset of the available \gls{os} services is available from their
context.
The StartupHook and ShutdownHook can be used to execute user specified code at
system start and shutdown. \gls{os} errors result in a call to the ErrorHook.
It can be used to execute application specific error handling. Finally,
PreTaskHook and PostTaskHook are called at task start and termination.
\subsection{OSEK OS Services}
\gls{osekos} specifies system services that can be used to interact with
internal \gls{os} mechanisms and objects like tasks or resources. The internal
presentation of system objects is implementation specific. Only specified
system services allow well-defined interaction with \gls{os} objects. A system
service may take zero or more input parameters and may return zero or more
output parameters via call by reference. The return value of an \gls{os}
service is of type \lstinline{StatusType}. \autoref{tab:status_types} shows
defined status types.
\begin{table}[]
\centering
\begin{tabular}{r|l}
Define & Meaning\\
\hline
\lstinline!E_OK! & Service finished correctly. \\
\lstinline!E_OS_ACCESS! & Calling task is not an extended task. \\
\lstinline!E_OS_CALLLEVEL! & Service called from invalid level. \\
\lstinline!E_OS_ID! & Invalid \gls{os} \gls{id}. \\
\lstinline!E_OS_LIMIT! & Number of activations is exceeded. \\
\lstinline!E_OS_NOFUNC! & Alarm or resource is not in use. \\
\lstinline!E_OS_RESOURCE! & A resource is still occupied. \\
\lstinline!E_OS_STATE! & Object is in invalid state. \\
\lstinline!E_OS_VALUE! & Value is not allowed. \\
\end{tabular}
\caption[\gls{osekos} error codes]{\gls{osekos} defines a
\lstinline{StatusType} type that can be used to return an error code from
service routines. This table shows the status types that are defined by
\gls{osek} and their meaning. Users are free to define additional codes.}
\label{tab:status_types}
\end{table}
A task can be activated via alarm or \lstinline{ActivateTask} service routine.
The latter is callable from interrupt and task level. The task to be activated
must be provided as an input parameter. If this task is suspended its state
will be changed to ready. If it is not suspended the pending activations
counter is incremented or \lstinline{E_OS_LIMIT} is returned if the \gls{mta}
limit is exceeded.
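The described activation logic can be sketched as follows; a simplified model
with hypothetical names, where the error value is illustrative and real
implementations track considerably more state:

```c
#define E_OK       0
#define E_OS_LIMIT 4  /* illustrative numeric value */

typedef enum { TS_SUSPENDED, TS_READY, TS_RUNNING, TS_WAITING } tcb_state_t;

typedef struct {
    tcb_state_t state;
    int pending;      /* pending activation requests          */
    int max_pending;  /* MTA limit from the conformance class */
} tcb_t;

static int activate_task(tcb_t *t)
{
    if (t->state == TS_SUSPENDED) {
        t->state = TS_READY;       /* suspended -> ready */
        return E_OK;
    }
    if (t->pending >= t->max_pending)
        return E_OS_LIMIT;         /* MTA limit exceeded */
    t->pending++;                  /* record a pending activation */
    return E_OK;
}
```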
\lstinline{TerminateTask} is used to switch a task from running to suspended.
All internal task resources are released and the service will not return if the
call was successful. \lstinline{TerminateTask} will fail with
\lstinline{E_OS_RESOURCE} if resources are still occupied by a task.
\lstinline{ChainTask} is a combination of \lstinline{ActivateTask} and
\lstinline{TerminateTask}. It terminates the current task and activates
another task which is provided via input parameter.
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{./media/osek/non_vs_full_schedule.pdf}
\caption[Explicit \gls{osekos} schedule call]{An explicit call to the
scheduler can solve the problem of a delayed higher priority task.}
\label{fig:non_vs_full_schedule}
\end{figure}
\lstinline{Schedule} can be called to explicitly trigger a scheduling decision.
This makes sense for non preemptive tasks if a task with higher priority is
ready. Normally the task with higher priority is delayed until the task with
low priority has finished execution as shown in
\autoref{fig:non_vs_full_preemptive_scheduling}. By calling
\lstinline{Schedule} the non preemptive task is preempted and the task with
higher priority is executed as illustrated in
\autoref{fig:non_vs_full_schedule}.
The routines \lstinline{GetResource} and \lstinline{ReleaseResource} can be
used to request and release resources. Nested resource requests are only
allowed in last-in-first-out order, i.e.\ the resource that has been requested
first must be released last. Within a critical section that is protected via a
resource no calls to services that cause rescheduling are allowed. Both
methods can be called from task and \gls{isr} level. If a requested resource
is already occupied \lstinline{E_OS_ACCESS} is returned.
Interaction with event objects is done via \lstinline{SetEvent},
\lstinline{ClearEvent}, \lstinline{GetEvent}, and \lstinline{WaitEvent} service
routines. \lstinline{SetEvent} takes a mask of events that should be set for a
specific task. Events can be deleted from the context of a process owning this
event via \lstinline{ClearEvent}. \lstinline{GetEvent} returns the current
status of all events related to a specified task. A task can wait for one or
more events using the \lstinline{WaitEvent} service routine. Waiting lasts
until at least one of the specified events is set.
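The event semantics can be sketched as follows; a simplified single-task model
with hypothetical names that omits scheduling and error handling:

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint32_t active;    /* events currently set                   */
    uint32_t waitmask;  /* events the task is waiting for         */
    bool     waiting;   /* task resides in the waiting state      */
} etask_t;

/* Set events for a task; release it if an awaited event matches. */
static void set_event(etask_t *t, uint32_t mask)
{
    t->active |= mask;
    if (t->waiting && (t->active & t->waitmask) != 0)
        t->waiting = false;  /* waiting -> ready */
}

/* Wait for one or more events; block only if none is set yet. */
static void wait_event(etask_t *t, uint32_t mask)
{
    t->waitmask = mask;
    if ((t->active & mask) == 0)
        t->waiting = true;   /* enter the waiting state */
}
```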
The service routine \lstinline{GetAlarmBase} returns the basic configuration of
an alarm. The remaining ticks until an alarm expires can be retrieved with
\lstinline{GetAlarm}. \lstinline{SetRelAlarm} sets the expiry time relative
to the current counter value, while \lstinline{SetAbsAlarm} sets it to an
absolute counter value. An alarm can be deactivated with \lstinline{CancelAlarm}.
\subsection{OSEK OIL and ORTI}
\label{subsection:osek_oil_and_orti}
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{./media/osek/osek_code_generation.pdf}
\caption[\gls{osekos} build process]{An \gls{osek} application is compiled
from three sources. The \gls{os} kernel, user created code and \gls{osekos}
object definition files which are created via code generation based on one or
more \gls{oil} files.} \label{fig:osek_code_generation}
\end{figure}
The implementation of system objects is not specified by \gls{osek}.
Therefore, users cannot know how to create system objects because the correct
definition depends on the \gls{os}. \glsdesc{oil} (\gls{oil}) solves this
problem by providing a meta language for defining system objects
\cite{osekoil}. Based on \gls{oil} configuration files code generators
provided by the \gls{os} vendor can produce \gls{os} specific source code. In
combination with kernel and user code an application can be built as shown in
\autoref{fig:osek_code_generation}.
\gls{osek} specifies data types for all system object types. However, the
implementation of the data types is \gls{os} specific. For example, a task is
identified by \lstinline{TaskType}. \lstinline{TaskType} could be implemented
as an integer indexing a global list of task objects or as a pointer to the
task object itself.
Only the minimum set of data types necessary to interact with service
routines is specified. Consequently, a lot of information is kept internally
by the \gls{os} and is not available for the user. For example, there is no
common interface to get data of the pending activations of a task, the
current state of a resource, or the state of an event.
Application code that needs this information would need to access the \gls{os}
internals directly which results in portability and security issues. Moreover,
external tools like debuggers that want to provide \gls{os} aware debug
information have no standardized interface to relevant internal data.
\glsdesc{orti} (\gls{orti}) was specified to solve this problem. Via
\gls{orti} tool vendors have a standardized interface to \gls{os} internal data
and properties of relevant system objects. The \glsdesc{koil} (\gls{koil})
format is used to exchange relevant information via the \gls{orti} file. This
file contains mappings from \gls{os} object properties to variables that
hold the respective information.
\gls{orti} specifies a set of system properties that must be available for
every \gls{osek} compliant \gls{os}. Operating system vendors are free to add
additional information. Each \gls{os} object is described in a separate
section of the \gls{orti} file. The specified sections that are relevant for
this thesis are \emph{os}, \emph{task}, \emph{alarm}, and \emph{resource}.
Information about the currently running process, the system error state, and
the active service routine can be found in the \gls{os} section shown in
\autoref{tab:os_attributes}. The \emph{servicetrace} attribute is written
whenever a service routine is started or finished along with the \gls{id} of
the corresponding routine. Task and \gls{isr} processes that are currently
running in a system can be retrieved via \emph{runningtask} and
\emph{runningisr2}. The attribute \emph{lasterror} provides information about
the last failure condition.
As shown in \autoref{tab:os_task} the \gls{orti} task section makes the current
\emph{priority}, \emph{state}, and number of open activations
(\emph{currentactivations}) for each task available. \autoref{listing:os_task}
shows the textual representation of the \gls{orti} attributes for a single
task.
\begin{code}
\begin{lstlisting}[caption={[\gls{orti} task example]Textual representation of
the \gls{orti} attributes for a task entity.},
label={listing:os_task}]
TASK T_CylinderResponser {
priority = "osTcbActualPrio[30]";
state = "osTcbTaskState[30]";
currentactivations = "osTcbActivationCount[30]";
};
\end{lstlisting}
\end{code}
\autoref{tab:os_alarm} shows that alarms have an \emph{alarmtime} attribute
that contains the ticks to the next expiry time. A \emph{cycletime} is used
for periodic alarms. If a cyclic alarm expires, \emph{alarmtime} is reset to
this value. An alarm can be running or stopped which is indicated by the
\emph{state} attribute and executes a certain \emph{action} if \emph{alarmtime}
becomes zero.
A resource can be locked or free which is indicated by the \emph{state}
attribute. In the former case the \emph{locker} attribute indicates the
corresponding process as shown in \autoref{tab:os_resource}. The resource
\emph{priority} is also accessible.
\begin{table}[]
\centering
\begin{tabular}{r|l}
Attribute & Content \\
\hline
runningtask & currently running task \\
runningisr2 & currently running category 2 \gls{isr} \\
servicetrace & indicates entry and exit to service routines \\
lasterror & contains the last error code set by the system \\
\end{tabular}
\caption[\gls{orti} \gls{os} section]{The \gls{orti} \gls{os} section
provides information about the running task and category 2 \gls{isr}, entry
and exit to service routines and the last system error.}
\label{tab:os_attributes}
\end{table}
\begin{table}[]
\centering
\begin{tabular}{r|l}
Attribute & Content \\
\hline
priority & task priority \\
state & task state (\autoref{fig:extended_task_state_model})\\
currentactivations & number of task activations \\
\end{tabular}
\caption[\gls{orti} task section]{The \gls{orti} task section provides
information about the current task priority, task state and number of
activations. The task priority can differ from the statically defined
value because of the priority ceiling protocol.}
\label{tab:os_task}
\end{table}
\begin{table}[]
\centering
\begin{tabular}{r|l}
Attribute & Content \\
\hline
alarmtime & time till alarm expires \\
cycletime & alarm cycle time of periodic alarms \\
state & alarm state (running or stopped) \\
action & action at alarm expiry time \\
\end{tabular}
\caption[\gls{orti} alarm section]{The \gls{orti} alarm section provides
information about the time that is left until an alarm expires, its cycle
time, the current state and the action that is executed once the alarm
expires.}
\label{tab:os_alarm}
\end{table}
\begin{table}[]
\centering
\begin{tabular}{r|l}
Attribute & Content \\
\hline
state & resource state (locked or unlocked) \\
locker & the task that has locked a resource \\
priority & resource priority \\
\end{tabular}
\caption[\gls{orti} resource section]{The \gls{orti} resource section
provides information about the state of a resource. A resource can be locked
or not. For a locked resource the corresponding task is made available.}
\label{tab:os_resource}
\end{table}
Additional sections and attributes can be found in the \gls{osek} \gls{orti}
specification \cite{osekortib}. Even via \gls{orti} not all \gls{os}
internals become available. Via \emph{servicetrace} it can be detected that a
certain event is set or cleared but no information about the event itself is
available. Consequently, for certain use cases it may still be necessary to
access \gls{os} specific data structures manually.
\chapter{System Trace}
\label{chapter:btf}
In \autoref{section:trace_measurement} a trace has been defined as a
sequence of chronological ordered events. An event is a system state change an
evaluator is interested in. Various trace tools exist to record events. They
can be classified into hardware, hybrid and software tools.
The same event can be represented on different levels as shown in
\autoref{fig:trace_event_levels}. An event within the respective level
shall be called hardware, software or system event. An event can be moved
from one level into another via transformation. For example, a voltage level
change in memory is a hardware event; the corresponding software event could
be a value change of a certain variable.
The representation levels do not correspond to the measurement tools of the
same name. A hardware event can be measured directly via hardware tracing,
but it is also possible to detect the change of a variable via software
tracing. Based on the software event the occurrence of the respective hardware
event can be deduced. This means hardware event detection is not limited to
hardware tracing, and the same holds for software events.
A well-defined format for events is required for further processing of recorded
traces. Tools that analyze or visualize a trace must be able to interpret the
recorded data. For example, the hardware trace host software must be able to
understand the hardware events that are generated by the on-chip trace device.
Otherwise it is not possible to transform the hardware events into higher level
software events.
Depending on the measurement goal a different event level may be required: A
hardware designer is not interested in the timing behavior of an engine control
unit, rather the correct functionality of a certain hardware register is of
interest. On the other hand, an application architect relies on the correct
functionality of the hardware, but the timing behavior of an application on
architecture level must be analyzable.
\section{BTF}
A system level trace can be used to analyze timing, performance and reliability
of an embedded system. \gls{btf} (\glsdesc{btf}) \cite{btf} was specified to
support these use cases. It assumes a signal processing system, where one
entity influences another entity in the system. This means an event does not
only contain the information about which system state changed, but also the
source for the change. For example, a regular system event could be the
activation of a task with the corresponding timestamp. A \gls{btf} event
additionally contains the information that the task activation was triggered by
a certain alarm.
A \gls{btf} event is defined as an octuple
\begin{equation}
\label{eq:btf_trace}
b_{k} = (t_k,\, \Psi_k,\, \psi_k,\, \iota_k,\, T_k,\, \tau_k,\, \alpha_k,\,
\nu_k),\, k \in \mathbb{N},
\end{equation}
where each element represents a certain \gls{btf} field: $t_k$ is the
\emph{timestamp}, $\Psi_k$ is the \emph{source}, $\psi_k$ is the \emph{source
instance}, $\iota_k$ is the \emph{target type}, $T_k$ is the \emph{target},
$\tau_k$ is the \emph{target instance}, $\alpha_k$ is the event \emph{action}
and $\nu_k$ is an optional \emph{note}. A \gls{btf} trace can now be defined as
\begin{equation}
B = \{b_k | t_{k} \leq t_{k+1} \wedge k \leq n\},\, n \in \mathbb{N},
\end{equation}
where $k$ is the index of a certain event and $n$ is the number of elements in
the trace.
The timestamp field is an integer value $t_k \in \mathbb{N}_{0}$. All
timestamps within the same trace must be specified relative to a certain point
in time, which is usually the start of trace measurement. System and trace
start can occur at different points in time. Consequently, neither the trace
nor the system start necessarily occurs at $t = 0$. The time period between two events $b_{k}$
and $b_{k+1}$ can be calculated as $\Delta t = t_{k+1} - t_{k}$. If not
specified otherwise, the time unit for $t_k$ is nanoseconds.
A \gls{btf} event represents the notification of one entity by another. There
exist different entity types corresponding to the software and hardware objects
of an application. Each entity of a certain type has a unique name that must
not be shared by entities of other types. Certain entity types have a life
cycle. This means multiple instances of the same entity can occur within the
same trace. Instance counters are required to distinguish between different
instances of the same entity. This is important for multicore systems where an
entity can run on two processor cores in parallel.
\autoref{fig:entity_inheritance} depicts the relationship between entity type,
entity and entity instance.
\begin{figure}[]
\centering
\includegraphics[width=0.7\textwidth]{./media/btf/entity_inheritance.pdf}
\caption[\gls{btf} entity inheritance]{A \gls{btf} event represents the
influence of one entity on another. Entities have different types and multiple
entities can exist for one type, identified by a unique name. Entities that
have a life cycle, such as runnables, can be instantiated multiple times. An
instance counter is required to distinguish multiple instantiations.}
\label{fig:entity_inheritance}
\end{figure}
A basic entity type is a runnable, which is essentially a simple function. A
system may contain multiple runnable entities called \emph{Runnable\_1},
\emph{Runnable\_2} and \emph{Runnable\_3}. The life cycle of a runnable starts
with the execution of this runnable and ends when it terminates. A runnable
can execute different actions such as calling another runnable or writing a
variable. In a multicore system the same runnable entity \emph{Runnable\_2}
may be executed by two other runnables that are running in parallel on
different cores. If \emph{Runnable\_2} writes to a variable, it is not known
from which core that write occurred. The instance counter identifies which
instance executed the write and thus resolves this ambiguity.
The \emph{source} and \emph{target} fields are strings that represent entities
which are part of the system. The target entity is influenced by the source
entity. \emph{Source instance} and \emph{target instance} are positive integer
values that identify the instance of the respective entity. \emph{Target type}
is the type of the target entity. Types are represented by their corresponding
type \gls{id}. A source type field is not part of a \gls{btf} event even
though it would make sense in certain cases.
The \emph{action} field indicates the way in which one entity is influenced
by another. Depending on the source and target entity types, different actions
are possible and allowed by the specification. The last field, \emph{note}, is
optional and can be used to carry additional information for certain
events. \autoref{tab:btf_fields} summarizes the meaning of the different
\gls{btf} fields.
\begin{table}[]
\centering
\begin{tabular}{r|l}
Field & Meaning \\
\hline
time & Timestamp relative to a certain point in time. \\
source & Name of the entity that caused an event. \\
source instance & Instance number of the entity that caused an event. \\
target type & Type of the entity that is influenced by an event. \\
target & Name of the entity that is influenced by an event. \\
target instance & Instance of the entity that is influenced by an event. \\
action & The way in which target is influenced by source. \\
note & An optional field that is used for certain events. \\
\end{tabular}
\caption[\gls{btf} event fields]{A \gls{btf} event consists of eight fields.
An event describes the way in which one system entity is influenced by
another one.}
\label{tab:btf_fields}
\end{table}
A \gls{btf} trace can be persisted in a \gls{btf} trace file. This file
consists of two sections: a meta and a data section. The meta section stands
at the beginning of the file. It contains information such as the \gls{btf}
version, the creator of the trace file, the creation date and the time unit
used by the time field. Each meta attribute stands in a separate line in the
form \lstinline{#<attribute name> <attribute definition>}. The data section
contains one \gls{btf} event per line in chronological order: the first event
stands at the beginning of the data section and the last event at the end
of the file. Comments are denoted by a \lstinline{#} followed by a space.
\autoref{listing:btf_example} shows an example trace file.
\begin{code}
\begin{lstlisting}[caption={[An example \gls{btf} trace file]A \gls{btf} trace
file consists of two sections. A meta section at the beginning of a file
includes information such as creator, creation date and time unit. It is
followed by a data section that contains one event per line. Comments are
denoted by a number sign followed by a space.},
label={listing:btf_example}]
#version 2.1.4
#creator BTF-Writer (15.01.0.537)
#creationDate 2015-02-18T14:18:20Z
#timeScale ns
0, Sim, 0, STI, S_1MS, 0, trigger
0, S_1MS, 0, T, T_1MS_0, 0, activate
100, Core_0, 0, T, T_1MS_0, 0, start
100, T_1MS_1, 0, R, Runnable_0, 0, start
25000, T_1MS_1, 0, R, Runnable_0, 0, terminate
25100, Core_1, 0, T, T_1MS_0, 0, terminate
# | | | | | | |
# time source | | target | action
# source instance | target instance
# target type
#
# Note that a number sign followed by a space denotes
# a comment. Whitespaces in the data section are ignored.
\end{lstlisting}
\end{code}
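The line-oriented format above lends itself to simple tooling. The following Python sketch (illustrative only; \lstinline{BtfEvent} and \lstinline{parse_btf} are assumed names, not part of any BTF tool) parses data lines into the event octuple while skipping meta attributes and comments:

```python
from typing import List, NamedTuple, Optional

class BtfEvent(NamedTuple):
    """One BTF event: (time, source, source instance, target type,
    target, target instance, action, optional note)."""
    time: int
    source: str
    source_instance: int
    target_type: str
    target: str
    target_instance: int
    action: str
    note: Optional[str] = None

def parse_btf(lines: List[str]) -> List[BtfEvent]:
    """Parse BTF data lines; lines starting with '#' are meta
    attributes or comments and are skipped."""
    events = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = [f.strip() for f in line.split(",")]
        events.append(BtfEvent(
            time=int(fields[0]), source=fields[1],
            source_instance=int(fields[2]), target_type=fields[3],
            target=fields[4], target_instance=int(fields[5]),
            action=fields[6],
            note=fields[7] if len(fields) > 7 else None))
    return events

trace = parse_btf([
    "#timeScale ns",
    "0, Sim, 0, STI, S_1MS, 0, trigger",
    "100, Core_0, 0, T, T_1MS_0, 0, start",
])
```

Note that the optional note field is simply absent for events with only seven fields, as in the example trace file above.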
\section{BTF Entity Types}
\begin{table}[]
\centering
\begin{tabular}{c|c c }
Category & Entity Type & Type \gls{id} \\
\hline
& Task & T \\
Software & \gls{isr} & I \\
& Runnable & R \\
\hline
& Signal & SIG \\
\gls{os} & Semaphore & SEM \\
& Event & EVENT \\
\hline
& Simulation & SIM \\
Other & Core & Core \\
& Stimulus & STI \\
\hline
& Instruction Block & IB \\
& Electronic Control Unit & ECU \\
Not discussed & Processor & Processor \\
& Memory Module & M \\
& Scheduler & SCHED \\
\end{tabular}
\caption[\gls{btf} entity types]{\gls{btf} entity types can be divided into
three categories: software, \gls{os} and other types. Entity types are
represented by their type \gls{id}. Some types are not relevant for this
thesis and are therefore not discussed.}
\label{tab:entity_overview}
\end{table}
\gls{btf} specifies the entity types that can be used for \gls{btf} events.
Each entity type can be influenced by certain other types and vice versa. The
actions, i.e.\ the ways in which one entity can influence another, are also
defined. Different actions are possible for different entity types. Entity
types can be categorized into software, \gls{os} and other entity types.
Not all entity types specified by \gls{btf} are discussed in detail as shown in
\autoref{tab:entity_overview}. The entity type instruction block (\emph{IB})
represents a fragment of a runnable. This concept is used by simulation
but does not translate to a concept used by a real application.
An electronic control unit (\emph{ECU}) consists of at least one processor
(\emph{Processor}). This concept makes it possible to represent a system that
contains multiple processors which communicate with each other. The recording
of a multi system aware hardware trace would require a measurement
configuration with multiple trace tools that are synchronized to each other.
The design of such a setup was not in the scope of this thesis, which is why
ECU and processor entities are not discussed.
Memory modules (\emph{M}) can be used to represent different memory sections of
a CPU\@. The \gls{btf} specification does not provide further information
about memory modules. Via hardware tracing the information about which memory
sections are accessed by certain data events becomes available. Since the
specification does not provide further details about how to use memory modules,
no further discussion is possible.
The scheduler (\emph{SCHED}) entity type is used to represent actions executed
by the \gls{os} that relate to the scheduling of task and process instances.
Scheduler events become available implicitly via the respective process
actions.
\subsection{Software Entity Types}
\gls{btf} distinguishes three kinds of software entity types: tasks,
\glspl{isr} and runnables, with the respective type \glspl{id} \emph{T},
\emph{I} and \emph{R}. Tasks and \glspl{isr} are collected under the umbrella
term process. Accordingly, they share the same state and transition model as
shown in \autoref{fig:process_state_chart}.
\begin{figure}[]
\centering
\centerline{\includegraphics[width=1.3\textwidth]{./media/btf/process_state_chart.png}}
\caption[Process state figure]{\gls{btf} specifies more process states than
\gls{osek} (see \autoref{fig:extended_task_state_model}). The additional
states polling and parking are required to represent active waiting. Not
initialized and terminated indicate the beginning and end of a process
lifecycle. The green boxes between the states show the name of the \gls{btf}
action for the respective transition.}
\label{fig:process_state_chart}
\end{figure}
\textbf{Process} instances start in the \emph{not initialized} state. From
there they can be \emph{activated} in order to switch into the \emph{active}
state by a stimulus (\emph{STI}) entity. All state transitions except
\emph{activate} are executed by core (\emph{C}) entities. An active process
can be changed into the \emph{running} state by the core on which the process
is scheduled.
A running process can \emph{preempt}, \emph{terminate}, \emph{poll} and
\emph{wait}. Preemption occurs if another process is scheduled to be executed
on the core. In this case, the process can no longer be executed and changes
into the \emph{ready} state. A ready process \emph{resumes} running once the
core becomes available again. If a process has finished execution it
terminates and switches into the \emph{terminated} state. This finishes the
lifecycle of a process instance.
A process that \emph{polls} a resource switches into the active waiting state
\emph{polling}. A process that \emph{waits} for an event switches into the
passive waiting state \emph{waiting}. A \emph{waiting} process is
\emph{released} into the ready state if one of the requested events becomes
available. If a polled resource becomes available, the task continues running,
which is indicated by the \emph{run} action.
A polling process that is removed from the core is \emph{parked} and switched
into the \emph{parking} state. If the polled resource becomes available while
the process is parking it is switched into the ready state. This transition is
called \emph{release\_parking}. Otherwise the process continues polling, once
it is reallocated to the core, which is called \emph{poll\_parking}.
\autoref{tab:process_overview} summarizes process state transitions.
\begin{table}[]
\centering
\begin{tabular}{c c c c}
Current state & Next state & Action & Source Entity Types\\
\hline
not initialized & active & activate & STI \\
active & running & start & C \\
ready & running & resume & C \\
running & ready & preempt & C \\
running & terminated & terminate & C \\
running & polling & poll & C \\
running & waiting & wait & C \\
waiting & ready & release & C \\
polling & running & run & C \\
polling & parking & park & C \\
parking & ready & release\_parking & C \\
parking & polling & poll\_parking & C \\
\end{tabular}
\caption[Process state table]{Process entities can be in different states. A
process instance starts in the not initialized state and finishes in the
terminated state. Each state transition has a unique action name. The
activate action can only be triggered by a stimulus entity. All other
actions can only be triggered by a core entity.}
\label{tab:process_overview}
\end{table}
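The transition table can be encoded as a lookup from (current state, action) to the next state, which is useful for example to validate a recorded event stream. A hypothetical Python sketch (the function name \lstinline{replay} and the string encodings are assumptions, not prescribed by BTF):

```python
# Valid BTF process state transitions, keyed by (current state, action).
# Derived from the process state table; action names follow BTF.
PROCESS_TRANSITIONS = {
    ("not initialized", "activate"): "active",
    ("active", "start"): "running",
    ("ready", "resume"): "running",
    ("running", "preempt"): "ready",
    ("running", "terminate"): "terminated",
    ("running", "poll"): "polling",
    ("running", "wait"): "waiting",
    ("waiting", "release"): "ready",
    ("polling", "run"): "running",
    ("polling", "park"): "parking",
    ("parking", "release_parking"): "ready",
    ("parking", "poll_parking"): "polling",
}

def replay(actions):
    """Replay a sequence of actions for one process instance and
    return the final state; raise ValueError on an invalid transition."""
    state = "not initialized"
    for action in actions:
        key = (state, action)
        if key not in PROCESS_TRANSITIONS:
            raise ValueError(f"invalid transition {action!r} in state {state!r}")
        state = PROCESS_TRANSITIONS[key]
    return state
```

For example, replaying \lstinline{activate, start, preempt, resume, terminate} ends in the terminated state, whereas a \lstinline{start} without prior activation is rejected.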
In addition to state transition actions, \gls{btf} specifies process
notification actions. These actions do not trigger a process state change, but
indicate other events related to a process entity. The \emph{mtalimitexceeded}
action is triggered if more task instances than the allowed maximum are
activated. In this case, no new task instance is created. Therefore, a
notification event is necessary to make the occurrence available in the trace.
All other process notification actions are related to migration, the
reallocation of a process from one core to another. \gls{osekos} does not
support process migration since a separate kernel is executed on each core.
Thus migration notifications are not relevant for an \gls{osek} compliant
\gls{os}. Additionally, migration actions become available implicitly via the
respective process transition actions. If a process instance is preempted on
one core and resumed on another, the resume event will have a different source
core than the preempt event. Consequently, the related migration event is
known.
\textbf{Runnable} instances start in the not initialized state.
Runnables can be \emph{started} by \glspl{isr} and tasks in order to switch
into the \emph{running} state. A runnable that \emph{terminates} switches into
the \emph{terminated} stated and therefore finishes its lifecycle according to
\autoref{tab:runnable_overview}.
Since a runnable can only be executed from a process context it cannot
continue running if the respective process is preempted. In this case the
runnable is \emph{suspended} and switches into the \emph{suspended} state.
Once the process resumes execution the runnable can also \emph{resume}.
\begin{table}[]
\centering
\begin{tabular}{c c c c}
Current state & Next state & Action & Source Entity Types\\
\hline
not initialized & running & start & T, I \\
running & terminated & terminate & T, I \\
running & suspended & suspend & T, I \\
suspended & running & resume & T, I \\
\end{tabular}
\caption[Runnable state table]{All runnable actions can be triggered by task
and \gls{isr} entity types. A runnable lifecycle starts when the runnable
first starts execution and ends when the runnable is terminated. A runnable
is suspended and resumed depending on the process context in which it is
executed.}
\label{tab:runnable_overview}
\end{table}
\subsection{OS Entity Types}
\begin{table}[]
\centering
\begin{tabular}{c c}
Action & Source Entity Types \\
\hline
read & P \\
write & P, STI \\
\end{tabular}
\caption[Signal actions]{Signals can be read or written. For a write event
the new value is provided via the note field.}
\label{tab:signal_overview}
\end{table}
\gls{os} event types are categorized into signal, semaphore and event types.
Signals are identified by \emph{SIG}, semaphores by \emph{SEM} and events by
\emph{EVENT}.
\textbf{Signal} entities represent variables that are relevant for the analysis
of an application. There are only two signal actions: \emph{read} and
\emph{write} as shown in \autoref{tab:signal_overview}. A signal can be read
by a process entity. This means that the value of a variable is retrieved from
memory. A signal entity does not have a lifecycle, thus the instance counter
value for signals can remain constant.
Write actions can be executed by process and stimulus entities. A write action
means that a new value is assigned to a variable. If this assignment is done
from process context, the respective process entity is the source for the write
event. Otherwise a stimulus entity can be used to represent the source, for
example if a signal is changed by the \gls{os} or a hardware module.
For signal writes, the \gls{btf} note field must be used to denote the value
that was assigned to a variable, usually represented by an integer value in
decimal representation. However, \gls{btf} does not specify in which form the
value must be provided. For read events the note field can optionally be used
to indicate the value of the variable that was accessed.
\textbf{Semaphores} can be used to control access to a common resource in
parallel systems. The basic idea is that a process can request a semaphore,
before it enters a critical section, for example a section that contains access
to shared variables. If the semaphore is free, the request is accepted and the
semaphore will be locked. All requests to a locked semaphore fail, thus no
other process can access the shared variables. When the process leaves the
critical section, it releases the semaphore, which then becomes free for other
processes.
There exist different types of semaphores. A counting semaphore may be
requested multiple times. Every time a counting semaphore is requested, a
counter is incremented and every time a counting semaphore is released, the
same counter is decremented. The initial counter value is zero, and the
semaphore is locked once the counter has reached a predefined maximum value.
\begin{table}[]
\centering
\begin{tabular}{r l}
Action & Meaning \\
\hline
requestsemaphore & Process requests a semaphore \\
exclusivesemaphore & Process requests a semaphore exclusively \\
assigned & Process is assigned as the owner of a semaphore \\
waiting & Process is assigned as waiting to a locked semaphore\\
released & Assignment from process to semaphore is removed \\
increment & Semaphore counter is incremented \\
decrement & Semaphore counter is decremented \\
\end{tabular}
\caption[Semaphore process actions]{Processes can interact with semaphores
in different ways. If a process requests a semaphore successfully, it is
assigned to the semaphore and the counter is incremented, otherwise a waiting
event is triggered. Once a semaphore is released, the assignment is removed
and the counter is decremented.}
\label{tab:semaphore_process}
\end{table}
A binary semaphore is a specialization of a counting semaphore for which the
maximum counter value is one. A mutex is a binary semaphore that supports an
ownership concept. This means a mutex knows all processes that may request it.
This information allows the implementation of priority ceiling protocols in
order to avoid deadlocks and priority inversion. The \gls{osek} term for mutex
is \emph{resource} as described in \autoref{subsection:osek_architecture}.
\gls{btf} semaphore events can be used to represent the different semaphore
types mentioned above. Semaphore actions can be divided into two categories:
Actions triggered by process instances as shown in
\autoref{tab:semaphore_process} and actions executed by a semaphore entity
itself.
A process request to a semaphore is indicated by \emph{requestsemaphore}. If a
request is successful (the semaphore is not locked), the semaphore counter is
\emph{incremented} and the process is \emph{assigned} to the semaphore. The
\emph{exclusivesemaphore} action represents a semaphore request that only
succeeds if the semaphore is currently not requested by any other process,
i.e.\ the counter value is zero. If a process fails to request a semaphore and
switches into polling mode in order to wait for this semaphore, this is
indicated by the \emph{waiting} action. A process that releases a semaphore
\emph{decrements} the semaphore counter and the respective semaphore is
\emph{released}, i.e.\ the process is no longer assigned to it.
\begin{figure}[]
\centering
\centerline{\includegraphics[width=1.2\textwidth]{./media/btf/semaphore_state_chart.png}}
\caption[Semaphore states and actions]{Semaphore entities do not have a
lifecycle. Nevertheless, they must be initialized before they are ready for
the first time. A semaphore can be unlocked or locked. A counting semaphore
can be requested multiple times, in which case it changes into the used state.
If there are no requests the semaphore is free. A semaphore that has received
at least as many requests as allowed is full and changes into the locked
state. Further requests in the locked state result in an overfull action.}
\label{fig:semaphore_state_chart}
\end{figure}
Semaphores do not have a lifecycle, which is why their instance counter remains
constant. Nevertheless, a semaphore must be moved from the \emph{not
initialized} to the \emph{free} state by the \emph{ready} action before it is
requested for the first time as shown in
\autoref{fig:semaphore_state_chart}.
A free semaphore is not requested by any processes. Once it is requested for
the first time, the behavior is dependent on the semaphore type. A mutex or
binary semaphore is \emph{locked} and moved into the \emph{full} state. A
counting semaphore is changed into the \emph{used} state, which is indicated by
the \emph{used} action. The used action is repeated for a counting semaphore
for each further request and release of the semaphore, as long as the counter
value stays greater than zero and smaller than the maximum value. If the
counter value of a used semaphore becomes zero this semaphore is \emph{freed}.
If the maximum counter value is reached the semaphore state becomes \emph{full}
which is indicated by the \emph{lock\_used} action.
When a full binary semaphore or mutex is released, it is \emph{unlocked} and
becomes free again, while a counting semaphore is changed back to the used
state, indicated by the \emph{unlock\_full} action. A request to a full
semaphore entity results in an \emph{overfull} action and the state is changed
to \emph{overfull}. The overfull state indicates that there is at least one
process polling a semaphore. Each additional request also results in an
overfull action. Once there are no more processes waiting for a semaphore,
this semaphore becomes full again, which is indicated by the \emph{full} action.
\autoref{tab:semaphore_semaphore} summarizes semaphore states and their
meaning.
\begin{table}[]
\centering
\begin{tabular}{r|l}
State & Meaning \\
\hline
not initialized & Semaphore is not ready \\
free & No process is assigned to the semaphore \\
used & At least one process is assigned \\
full & Maximum number of processes are assigned \\
overfull & Semaphore is full and there are further requests \\
\end{tabular}
\caption[Semaphore state overview]{Semaphores can be in five different
states. Before a semaphore entity can be used, it must be moved into the
free state. No process is assigned to a free semaphore. A counting
semaphore that has not yet reached its maximum request value, is in used
state. Once no further requests are accepted, a semaphore is full. A full
semaphore that is requested by another process is said to be overfull.}
\label{tab:semaphore_semaphore}
\end{table}
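The state overview can be summarized as a function of the current counter value, the maximum, and the number of polling processes. The following sketch is a hypothetical illustration (the not initialized state is omitted, since it only occurs before the ready action):

```python
def semaphore_state(counter, maximum, waiting):
    """Map a semaphore's counter and the number of processes polling
    it to the BTF semaphore state described in the table above."""
    if waiting > 0:
        return "overfull"  # full, and further requests are pending
    if counter == 0:
        return "free"      # no process is assigned
    if counter < maximum:
        return "used"      # counting semaphore, not yet at maximum
    return "full"          # maximum number of processes assigned
```

For a binary semaphore or mutex (\lstinline{maximum} of one) the used state is never reached, matching the state chart.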
\textbf{Events} are objects used for inter process communication, provided by
the \gls{os}. One process can use an event to notify another one, for example
when a computation finishes or a resource becomes available. Consequently, the
source entity for an event action must be a task or \gls{isr}. Event entities
do not have a lifecycle, therefore, no instance counter value is required.
There exist three event actions: \emph{wait\_event}, \emph{clear\_event} and
\emph{set\_event}. A process that waits for an event changes into passive
waiting mode until the respective event is set. An event can be set by
another process. For the \emph{set\_event} action it is necessary to specify
for which process entity the respective event is set. This information is
provided via the \gls{btf} note field. An event can be cleared by the process
for which the event was set.
\subsection{Other Entity Types}
There are three other entity types: simulation, core and stimulus entities.
The type \gls{id} for simulation is \emph{SIM}, for core \emph{C} and for
stimulus \emph{STI}.
\textbf{Stimuli} are used to depict application behavior that cannot be
represented by other entity types. The only stimulus action is \emph{trigger}.
A stimulus can be triggered by process and simulation entities. Once a
stimulus is triggered, it can be used for the actual event, e.g.\ the
activation of a task instance. Multiple stimulus instances can exist in a
system at a certain point in time. Thus the instance counter field must be
used for stimulus events.
\begin{table}[]
\centering
\begin{tabular}{r|l}
Action & Meaning \\
\hline
finalize & Initialization of system environment completed \\
error & An error occurred during trace recording \\
tag & Transmit meta information about the source entity \\
description & Provide a description for the source entity \\
\end{tabular}
\caption[Simulation actions]{The simulation entity can be used to provide
meta information about the trace environment. The name simulation is
misleading since the entity must also be used in a \gls{btf} trace recorded on
hardware. That is why the term \emph{system} is more appropriate. The system
entity can be used to trigger stimulus instances. For tag and description
events the note field is used to provide meta information.}
\label{tab:simulation_actions}
\end{table}
\textbf{Simulation} entities are used to provide meta information in a
simulated \gls{btf} trace as shown in \autoref{tab:simulation_actions}.
Nevertheless, this entity type can, and must, also be used in a hardware
trace. For example, a stimulus entity that activates a task can be triggered
by a process or simulation entity. In the first case, the resulting \gls{btf}
events represent an inter-process activation. However, in the case that a
process is activated by an alarm, process is not the correct source type and
simulation must be used instead. Since a simulation entity does not make much
sense in a hardware trace, \emph{system} is a more appropriate term to denote
the concept represented by the simulation entity type.
\textbf{Core} entities are used to provide an execution context for process
entities. Only one process can be allocated to a core at the same time. Core
entities do not have a lifecycle.
495
content/system_trace.tex Normal file
@@ -0,0 +1,495 @@
\section{System Trace}
\label{chapter:btf}
A trace is defined as a sequence of events. Events depict a change in the
state of a system and can be represented on different levels of abstraction.
These are discussed in more detail in \autoref{section:trace_measurement}.
For the timing analysis of embedded multi-core real-time systems a trace on
system level is required.
Tools that analyze or visualize traces must be able to interpret the recorded
events. For example, the software that interacts with hardware trace devices
must be able to understand the hardware events that are generated on-chip.
Otherwise it is not possible to transform the hardware events into higher level
software events. For that reason a well-defined format for events is required
for further processing of recorded traces.
Depending on the goal pursued with a trace measurement, one level of
abstraction can be more appropriate than another. On the one hand, a software
engineer who implements a feedback control system is mainly interested in the
functions and variables that correspond to that particular task. A system
engineer on the other hand, who integrates a variety of different modules into
a single application, is not interested in the details of each individual
module. Instead the functionality of the system as a whole is of interest.
\subsection{BTF Specification}
A trace on system level can be used to analyze timing, performance, and
reliability of an embedded system. \glsdesc{btf} (\gls{btf}) \cite{btf} was
specified to support these use cases. It assumes a signal processing system
where one entity influences another entity in the system. This means an event
does not only contain which system state changes but also the source of that
change. For example, an observed event on system level could be the activation
of a task with the corresponding timestamp. Then a \gls{btf} event
additionally contains the information that the task activation was triggered by
a certain alarm.
Let $k$ be an index in $\mathbb{N}_{0}$ denoting an individual event
occurrence; then a \gls{btf} event can be defined as an octuple
\begin{equation}
\label{eq:btf_trace}
b_{k} = (t_k,\, \Psi_k,\, \psi_k,\, \iota_k,\, T_k,\, \tau_k,\, \alpha_k,\, \nu_k)
\end{equation}
where each element maps to a \gls{btf} field: $t_k$ is the \emph{timestamp},
$\Psi_k$ is the \emph{source}, $\psi_k$ is the \emph{source instance},
$\iota_k$ is the \emph{target type}, $T_k$ is the \emph{target}, $\tau_k$ is
the \emph{target instance}, $\alpha_k$ is the event \emph{action} and $\nu_k$
is an optional \emph{note}.
A \gls{btf} trace can then be defined as a sequence of \gls{btf} events where
$n \in \mathbb{N}_{0}$ is the number of events in the trace:
\begin{equation}
B = (b_1, b_2, \dots, b_n)
\end{equation}
\begin{table}[]
\centering
\begin{tabular}{r|l}
Field & Meaning \\
\hline
time $(t)$ & Timestamp relative to a certain point in time. \\
source $(\Psi)$ & Entity that caused an event. \\
source instance $(\psi)$ & Entity instance that caused an event. \\
target type $(\iota)$ & Type of the entity that is influenced by an event. \\
target $(T)$ & Entity that is influenced by an event. \\
target instance $(\tau)$ & Entity instance that is influenced by an event. \\
action $(\alpha)$ & The way in which target is influenced by source. \\
note $(\nu)$ & An optional field that is used for certain events. \\
\end{tabular}
\caption[\gls{btf} event fields]{A \gls{btf} event consists of eight fields.
An event describes the way in which one system entity is influenced by
another one.}
\label{tab:btf_fields}
\end{table}
A \gls{btf} event can be represented textually as a comma-separated list where
each field maps to an element as shown in the following listing.
\vspace{1cm}
\begin{lstlisting}
12891, TASK_200MS, 3, SIG, EngineSpeed, 0, write, 42
\end{lstlisting}
\vspace{1cm}
The first field (\lstinline{12891}) represents the timestamp of the event. A
\gls{btf} trace contains the chronological order of events that occurred in a
system. Therefore, for each timestamp $t_k \in \mathbb{N}_{0}$ in a trace it
holds that $t_{k} \leq t_{k+1}$. All timestamps within the same trace must be
specified relative to a certain point in time, which can be chosen arbitrarily.
Hence, neither trace nor system start needs to occur at $t_0 = 0$. The time period
between two events $b_{k}$ and $b_{k+1}$ can be calculated as $\Delta t =
t_{k+1} - t_{k}$. If not specified otherwise, the unit for time is
nanoseconds.
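The ordering constraint and the time-period computation can be sketched as follows; this is a minimal illustration, assuming timestamps are given as 64-bit nanosecond values, and the function names are chosen for this sketch.

```c
#include <stdint.h>
#include <stddef.h>

/* Returns 1 if the timestamps are monotonically non-decreasing
 * (t_k <= t_{k+1}), as required for a valid BTF trace. */
static int btf_timestamps_ordered(const uint64_t *t, size_t n) {
    for (size_t k = 0; k + 1 < n; k++)
        if (t[k] > t[k + 1])
            return 0;
    return 1;
}

/* Time period between two consecutive events b_k and b_{k+1}. */
static uint64_t btf_delta(const uint64_t *t, size_t k) {
    return t[k + 1] - t[k];
}
```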
A \gls{btf} event represents the notification of one entity by another. Each
entity has a unique name. In the previous example, the source entity $\Psi$
has the name \lstinline{TASK_200MS} and the target entity $T$ is called
\lstinline{EngineSpeed}.
The fourth field \lstinline{SIG} is the short representation of the target
entity type $\iota$. \autoref{tab:entity_overview} gives an overview of all
entity types and their corresponding short \glspl{id}. Entity types are
discussed in more detail in \autoref{subsection:btf_entity_types}. In this
example, the target entity \lstinline{EngineSpeed} is a signal. The source
entity type is not part of a \gls{btf} event.
Some entities, namely tasks, \glspl{isr}, runnables, and stimuli, have a lifecycle.
This means at a certain point in time an entity becomes active in the system
and eventually it leaves the system. For example, the lifecycle of a task
starts with its activation and ends when it terminates. If \glspl{mta} are
allowed for an application, it is possible that multiple \emph{instances} of a
task are active at the same time. For those cases where multiple instances
of an entity are currently active, it is consequently not clear to which
instance of the entity the event refers.
Instance counter fields $\psi$ and $\tau$ are used to distinguish between
multiple instances of the same entity. The counters are integer values $\psi,
\tau \in \mathbb{N}_{0}$ that are incremented for each new entity becoming
active in the system. The first instance of an entity gets the counter value
$0$. \lstinline{TASK_200MS} has an instance counter value of \lstinline{3}
which means the event refers to the fourth instance of this entity. For
entities that do not have a lifecycle like signals, the counter field is not
relevant and $0$ can be used as a placeholder value.
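The counter assignment described above can be sketched as follows; the bookkeeping is illustrative, assuming entity names are unique strings and that fewer than a fixed bound of distinct lifecycle entities occur.

```c
#include <string.h>

/* Sketch of per-entity instance counters; all names are illustrative.
 * Assumes fewer than MAX_ENTITIES distinct lifecycle entities. */
#define MAX_ENTITIES 16

static struct {
    const char *name;
    unsigned    next_instance;
} counters[MAX_ENTITIES];
static int counter_count = 0;

/* Returns the instance number for a newly activated instance of the
 * given entity; the first instance of an entity gets the value 0. */
static unsigned new_instance(const char *entity) {
    for (int i = 0; i < counter_count; i++)
        if (strcmp(counters[i].name, entity) == 0)
            return counters[i].next_instance++;
    counters[counter_count].name = entity;
    counters[counter_count].next_instance = 1;
    counter_count++;
    return 0;
}
```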
The seventh field $\alpha$ represents the way in which the target entity is
influenced by the source entity. In this example \lstinline{TASK_200MS}
writes a new value to the signal entity \lstinline{EngineSpeed}. Depending on
source and target entity type, different actions are allowed by the
specification as discussed in \autoref{subsection:btf_actions}.
For signal write events the note field $\nu$ is used to denote the value that
is written to the signal, in this case \lstinline{42}. The note field is only
required for certain events. \autoref{tab:btf_fields} summarizes the meaning
of the different \gls{btf} fields.
A \gls{btf} trace can be persisted in a \gls{btf} trace file. This file
contains two parts: a meta and a data section. The meta section is written at
the beginning of the file. It contains general information on the trace such
as \gls{btf} version, creator of the trace file, creation date, and time unit
used by the time field. Each meta attribute uses a separate line, starting
with a \lstinline{#}, followed by the attribute name, a space, and the
attribute definition.
\begin{code}
\begin{lstlisting}[caption={[An example \gls{btf} trace file]A \gls{btf} trace
file consists of two sections. A meta section at the beginning of a file
includes information such as creator, creation date and time unit. It is
followed by a data section that contains one event per line. Comments are
denoted by a number sign followed by a space.},
label={listing:btf_example}]
#version 2.1.4
#creator BTF-Writer (15.01.0.537)
#creationDate 2015-02-18T14:18:20Z
#timeScale ns
0, Sim, 0, STI, S_1MS, 0, trigger
0, S_1MS, 0, T, T_1MS_0, 0, activate
100, Core_0, 0, T, T_1MS_0, 0, start
100, T_1MS_1, 0, R, Runnable_0, 0, start
25000, T_1MS_1, 0, R, Runnable_0, 0, terminate
25100, Core_1, 0, T, T_1MS_0, 0, terminate
\end{lstlisting}
\end{code}
In the data section one \gls{btf} event is written per line in chronological
order. The first event of a trace is located directly after the meta section
and the last event at the end of the file. Comments are denoted by a
\lstinline{#} followed by a space. \autoref{listing:btf_example} shows an
example trace file.
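A minimal writer for this file layout might look as follows; the function and buffer handling are illustrative, and the meta attributes and events are copied from \autoref{listing:btf_example}.

```c
#include <stdio.h>
#include <string.h>

/* Writes a minimal BTF file into buf: meta attributes first, then one
 * event per line. Values are copied from the example trace listing. */
static int write_btf(char *buf, size_t size) {
    int n = 0;
    n += snprintf(buf + n, size - n, "#version 2.1.4\n");
    n += snprintf(buf + n, size - n, "#timeScale ns\n");
    n += snprintf(buf + n, size - n, "%u, %s, %u, %s, %s, %u, %s\n",
                  0u, "Sim", 0u, "STI", "S_1MS", 0u, "trigger");
    n += snprintf(buf + n, size - n, "%u, %s, %u, %s, %s, %u, %s\n",
                  0u, "S_1MS", 0u, "T", "T_1MS_0", 0u, "activate");
    return n; /* total number of characters written */
}
```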
\subsection{BTF Entity Types}
\label{subsection:btf_entity_types}
As shown in \autoref{tab:entity_overview} \gls{btf} specifies fourteen entity
types that can be classified into five categories: environment, software,
hardware, operating system, and information. Some entity types are not
relevant for this thesis and therefore only discussed briefly. The actions,
in other words the ways in which one entity can be influenced by another,
are defined for each entity type as discussed in
\autoref{subsection:btf_actions}. Actions for types that are classified as not
relevant are not considered.
\begin{table}[]
\centering
\begin{tabular}{c|c c c}
Category & Entity Type & Type \gls{id} & Relevant \\
\hline
Environment & Stimulus & STI & X \\
\hline
& Task & T & X \\
Software & \gls{isr} & I & X \\
& Runnable & R & X \\
& Instruction Block & IB & \\
\hline
& Electronic Control Unit & ECU & \\
Hardware & Processor & Processor & \\
& Core & C & X \\
& Memory Module & M & \\
\hline
& Scheduler & SCHED & \\
Operating System & Signal & SIG & X \\
& Semaphore & SEM & X \\
& Event & EVENT & X \\
\hline
Information & Simulation & SIM & X \\
\end{tabular}
\caption[\gls{btf} entity types]{\gls{btf} entity types can be divided into
five categories. Types that are relevant in the context of this thesis are
marked by an X.}
\label{tab:entity_overview}
\end{table}
\textbf{Environment} contains only the stimulus entity type. Stimuli are used
to depict application behavior that cannot be represented by other entity
types. A stimulus can be used to activate a task or \gls{isr} and to set a
signal value. Multiple stimulus instances can exist in a system at a certain
point in time. Thus, the instance counter field is required for stimulus
entities.
\textbf{Software} contains the task, \gls{isr}, runnable, and instruction block
types. Tasks and \glspl{isr} summarized by the term process are containers
for application software and discussed in \autoref{section:osekvdxos}.
Runnable is a term established by \gls{autosar} and relates to the concept of C
type functions. A runnable can be executed from the context of processes and
contains application specific functionality. Multiple runnables can be active
in a system at the same time, for example, if the same runnable is executed by
two different tasks allocated to distinct cores. Hence, an instance counter is
required for runnable entities.
Instruction blocks are used to represent execution time within the context of
runnables. Since these execution times become available implicitly via the
corresponding runnable events, the addition of instruction blocks to a
\gls{btf} trace is optional and does not provide any immediate benefits.
\textbf{Hardware} contains the electronic control unit (ECU), processor, core,
and memory module types. An ECU consists of one or more processors. This
makes it possible to represent a multi-processor system. Generally, tracing only
supports the recording of a single processor. Multi-processor setups require a
way to synchronize the measurement between multiple trace measurement tools.
The design of such a setup is not in the scope of this thesis.
A processor is composed of one or more cores and recording multiple cores on
the same chip is feasible via tracing. Cores are necessary to map software and
\gls{os} events to the corresponding hardware entities. Since this information
is important for the analysis of embedded systems, cores are relevant for this
thesis.
Memory modules model different memory sections on a chip. They make it possible
to represent memory-related processes on the CPU, such as access times to variables
or cache misses. According to Helm \cite{christianmaster}, direct measurement
of memory access times is not possible. Instead, dedicated code must be added
to the application in order to determine the execution times for different
memory access operations. Due to the intrusiveness of this approach it is not
feasible for real applications. Therefore, memory modules are not supported in
this thesis.
\textbf{Operating System} covers scheduler, signal, semaphore, and event
entity types. The scheduler entity type is used to represent actions executed
by the \gls{os} that relate to the scheduling of process instances. Scheduler
events become available implicitly via the respective process actions and are
thus not considered in this thesis.
Signals represent access to variables that are relevant for the analysis of an
application. Consequently, signal events must be added to a \gls{btf} trace
that is recorded from hardware.
Semaphore entities are used to control access to common resources in parallel
systems. A process can request a semaphore before it enters a critical
section, e.g.\ a section that contains an access to a memory region that is
vulnerable to race conditions. If the semaphore is free the request is
accepted, the semaphore is locked and all subsequent requests fail. Once the
process has left the critical section it releases the semaphore.
Events are objects for inter-process communication provided by the \gls{os}.
One process can use an event to notify another one, for example, when a
computation finishes or a resource becomes available. Event entities do not
have a lifecycle; therefore, no instance counter value is required.
\textbf{Information} contains only the simulation entity type. This entity
type has two purposes. Firstly, it can be used to provide information about
errors that occurred during trace recording. Secondly, it is required to
trigger stimulus events. Since stimulus events are mandatory to represent task
activations by non-process objects, the simulation entity must be considered in
the context of this thesis. Because \emph{simulation} does not make sense in a
trace recorded from hardware, \emph{system} can be used as a more appropriate
term.
\subsection{BTF Actions}
\label{subsection:btf_actions}
\gls{btf} specifies different actions. The available actions are dependent on
the source and target entity types of the respective event.
\textbf{Stimuli} only support the \emph{trigger} action. A stimulus can be
triggered by process and simulation entities. Once a stimulus is triggered it
can be used for the actual event: the activation of a task or \gls{isr} or to
set the value of a signal.
\begin{figure}[]
\centering
\centerline{\includegraphics[width=\textwidth]{./media/btf/process_state_chart.png}}
\caption[Process state figure]{\gls{btf} \cite{btf} specifies more process
states than \gls{osek} (compare \autoref{fig:extended_task_state_model}). The
additional states polling and parking are required to represent active
waiting. Not initialized and terminated indicate the beginning and end of a
process lifecycle. The green boxes between the states show the name of the
\gls{btf} action for the respective transition.}
\label{fig:process_state_chart}
\end{figure}
\textbf{Process} entities support the actions shown in
\autoref{fig:process_state_chart}. A process instance starts in the \emph{not
initialized} state. From there it can be \emph{activated} by a stimulus entity
in order to switch into the \emph{active} state. All state transitions
except \emph{activate} are executed by core entities. An active process is
changed into the \emph{running} state as soon as it is scheduled by the
\gls{os}.
A running process can \emph{preempt}, \emph{terminate}, \emph{poll}, and
\emph{wait}. Preemption occurs if another process is scheduled to be executed
on the core. In this case, the current process changes into the \emph{ready}
state. A ready process \emph{resumes} running once the core becomes available
again. If a process finishes execution it \emph{terminates} and switches into
the \emph{terminated} state. This finishes the lifecycle of a process
instance.
A process that \emph{polls} a resource switches into the active waiting state
\emph{polling}. If the resource becomes available, the process continues
running, which is indicated by the \emph{run} action. A process that
\emph{waits} for an event switches into the passive waiting state
\emph{waiting}. A \emph{waiting} process is \emph{released} into the ready
state if one of the requested events becomes available.
A polling process that is removed from the core is \emph{parked} and switched
into the \emph{parking} state. If the resource becomes available while the
process is parking it is switched into the ready state. This transition is
called \emph{release\_parking}. Otherwise, the process continues polling once
it is reallocated to the core, which is called \emph{poll\_parking}.
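The state transitions described above can be sketched as a transition table. The state machine below is reconstructed from this description and \autoref{fig:process_state_chart}; the enum, function, and action names are chosen for this sketch.

```c
#include <string.h>

/* Process states and the transitions described in the text. */
typedef enum { NOT_INITIALIZED, ACTIVE, RUNNING, READY,
               POLLING, WAITING, PARKING, TERMINATED } proc_state;

static const struct {
    proc_state  from;
    const char *action;
    proc_state  to;
} transitions[] = {
    { NOT_INITIALIZED, "activate",        ACTIVE     },
    { ACTIVE,          "start",           RUNNING    },
    { RUNNING,         "preempt",         READY      },
    { READY,           "resume",          RUNNING    },
    { RUNNING,         "terminate",       TERMINATED },
    { RUNNING,         "poll",            POLLING    },
    { POLLING,         "run",             RUNNING    },
    { RUNNING,         "wait",            WAITING    },
    { WAITING,         "release",         READY      },
    { POLLING,         "park",            PARKING    },
    { PARKING,         "release_parking", READY      },
    { PARKING,         "poll_parking",    POLLING    },
};

/* Applies a transition action; returns the new state, or -1 if the
 * action is not allowed in the current state. */
static int apply_action(proc_state state, const char *action) {
    for (size_t i = 0; i < sizeof transitions / sizeof transitions[0]; i++)
        if (transitions[i].from == state &&
            strcmp(transitions[i].action, action) == 0)
            return (int)transitions[i].to;
    return -1;
}
```

A lookup of this kind can also serve to validate that a recorded event sequence only contains legal transitions.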
In addition to state transition actions, \gls{btf} specifies process
notification actions. These actions do not trigger a process state change but
indicate other events related to a process entity. The \emph{mtalimitexceeded}
action is triggered if more process instances than allowed are activated in
parallel. If this happens, no new task instance is created. Therefore, a
notification event is necessary to make the event visible in the trace.
All other process notification actions are related to migration, the
reallocation of a process from one core to another. \gls{osekos} does not
support process migration since a separate kernel is executed on each core.
Thus, migration notifications are not relevant for an \gls{osek} compliant
\gls{os}. Additionally, migration actions become available implicitly via the
respective process transition actions. If a process instance is preempted on
one core and resumed on another, the resume event has a different source core
than the preempt event. Consequently, the related migration event is known.
\begin{figure}[]
\centering
\centerline{\includegraphics[width=0.8\textwidth]{./media/btf/runnable_state_chart.png}}
\caption[Runnable state figure]{\gls{btf} runnable states and state
transitions \cite{btf}.}
\label{fig:runnable_state_chart}
\end{figure}
\textbf{Runnable} instances start in the \emph{not initialized} state as shown
in \autoref{fig:runnable_state_chart}. Runnables can be \emph{started} by
\glspl{isr} and tasks in order to switch into the \emph{running} state. A
runnable instance that \emph{terminates} switches into the \emph{terminated}
state and thereby finishes its lifecycle.
Because a runnable can only be executed from process context, it cannot
continue running if the respective process is preempted. In this case the
runnable must be \emph{suspended}. Once the process resumes execution, the
runnable can also \emph{resume}.
\textbf{Core} entities are used to provide an execution context for process
entities and cannot be used as a target entity themselves. Consequently, no
\gls{btf} core actions are specified. Only one process can be allocated to a
core at the same time and core entities do not have a lifecycle.
\textbf{Signal} entities can be influenced by two actions: \emph{read} and
\emph{write}. A signal can be read within the context of a process entity.
This means that the value of a variable is retrieved from memory. A signal
entity does not have a lifecycle; thus, the instance counter value for signals
can remain constant.
Write actions can be executed by process and stimulus entities. They indicate
that a new value is assigned to a variable. If this assignment is done from
process context, the respective process entity is the source for the write
event. Otherwise, a stimulus entity can be used to represent the source, for
example, if a signal is changed by the \gls{os} or a hardware module.
For signal writes, the note field must denote the value that was assigned to a
variable. For read events the note field can optionally indicate the value of
the variable that was accessed.
\textbf{Semaphores} can be categorized into different types. Counting
se\-ma\-phores can be requested multiple times. They have an initial counter
value of zero. For every request, this counter is incremented and every time
it is released the value is decremented. A counting semaphore is locked once
the counter has reached a predefined value.
A binary semaphore is a specialization of a counting semaphore for which the
maximum counter value is one. A mutex is a binary semaphore that supports an
ownership concept. This means a mutex knows all processes that may request it.
This information allows the implementation of priority ceiling protocols in
order to avoid deadlocks and priority inversion. The \gls{osek} term for mutex
is \emph{resource}, resources are discussed in
\autoref{subsection:osek_architecture}.
\gls{btf} semaphore events can represent all mentioned semaphore types.
Semaphore actions can be divided into two categories: actions triggered by
process instances as shown in \autoref{tab:semaphore_process} and actions
executed by a semaphore entity itself as shown in
\autoref{fig:semaphore_state_chart}.
\begin{table}[]
\centering
\begin{tabular}{r l}
Action & Meaning \\
\hline
requestsemaphore & Process requests a semaphore. \\
exclusivesemaphore & Process requests a semaphore exclusively. \\
assigned & Process is assigned as the owner of a semaphore. \\
waiting & Process is assigned as waiting to a locked semaphore.\\
released & Assignment from process to semaphore is removed. \\
increment & Semaphore counter is incremented. \\
decrement & Semaphore counter is decremented. \\
\end{tabular}
\caption[Semaphore process actions]{Processes can interact with semaphores in
different ways. If a process requests a semaphore successfully, it is
\emph{assigned} to the semaphore and the counter is \emph{incremented},
otherwise a \emph{waiting} event is triggered. Once a semaphore is
\emph{released}, the assignment is removed and the counter is
\emph{decremented}.}
\label{tab:semaphore_process}
\end{table}
\begin{figure}[]
\centering
\centerline{\includegraphics[width=\textwidth]{./media/btf/semaphore_state_chart.png}}
\caption[Semaphore states and actions]{\gls{btf} \cite{btf} semaphore entities
do not have a lifecycle. Nevertheless, they must be \emph{initialized} before
they are ready for the first time. A semaphore can be \emph{unlocked} or
\emph{locked}. A counting semaphore can be requested multiple times in which
case it changes into the \emph{used} state. If there are no requests the
semaphore is \emph{free}. A semaphore that has at least as many requests as
allowed is \emph{full} and changes into the \emph{locked} state. Further
requests in the locked stated result in an \emph{overfull} action.}
\label{fig:semaphore_state_chart}
\end{figure}
A process request to a semaphore is indicated by \emph{requestsemaphore}. If a
request is successful the semaphore counter is \emph{incremented} and the
process is \emph{assigned} to the semaphore. The \emph{exclusivesemaphore}
action represents a semaphore request that only succeeds if the semaphore is
currently not requested by any other process, i.e.\ the counter value is zero.
If a process fails to request a semaphore, it switches into polling mode, which
is indicated by the \emph{waiting} action. A process that releases a semaphore
\emph{decrements} the semaphore counter and the respective semaphore is
\emph{released}; the process is no longer assigned to it.
Semaphores do not have a lifecycle which is why their instance counter remains
constant. Nevertheless, a semaphore must be moved from the \emph{not
initialized} to the \emph{free} state by the \emph{ready} action before it is
requested for the first time.
A free semaphore is not requested by any process. At the first request the
behavior is dependent on the semaphore type. A mutex or binary semaphore is
\emph{locked} and moved into the \emph{full} state. A counting semaphore is
changed into the \emph{used} state which is indicated by the \emph{used}
action. The used action is repeated for a counting semaphore for each further
request or release as long as the counter value stays greater than zero and
smaller than the maximum value. If the counter value of a used semaphore
becomes zero, this semaphore is \emph{freed}. If the maximum counter value is
reached the semaphore state becomes \emph{full} which is indicated by the
\emph{lock\_used} action.
When a full binary semaphore or mutex is released, it is \emph{unlocked} and
becomes free again, while a counting semaphore is changed back to the used
state, indicated by the \emph{unlock\_full} action. A request to a full
semaphore entity results in an \emph{overfull} action and the state is changed
to \emph{overfull}. The overfull state indicates that there is at least one
process polling a semaphore. Each additional request also results in an
overfull action. Once there are no more processes waiting for a semaphore,
this semaphore becomes \emph{full} again.
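The counter and state handling for a counting semaphore can be sketched as follows; this is a simplified model derived from the description above, not \gls{btf} reference code, and all identifiers are chosen for this sketch.

```c
/* Simplified counting semaphore: `count` tracks granted requests,
 * `waiting` tracks processes polling a locked semaphore. */
typedef enum { SEM_FREE, SEM_USED, SEM_FULL, SEM_OVERFULL } sem_state;

typedef struct {
    int count;    /* granted requests                  */
    int waiting;  /* processes polling the semaphore   */
    int max;      /* counter value at which it locks   */
    sem_state state;
} counting_sem;

/* Derives the BTF semaphore state from the counters. */
static void sem_update(counting_sem *s) {
    if (s->count >= s->max)
        s->state = s->waiting > 0 ? SEM_OVERFULL : SEM_FULL;
    else
        s->state = s->count == 0 ? SEM_FREE : SEM_USED;
}

static void sem_request(counting_sem *s) {
    if (s->count < s->max) s->count++;   /* incremented and assigned  */
    else                   s->waiting++; /* full: requester polls     */
    sem_update(s);
}

static void sem_release(counting_sem *s) {
    if (s->waiting > 0)    s->waiting--; /* polling process takes over */
    else if (s->count > 0) s->count--;
    sem_update(s);
}
```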
\textbf{Events} can be influenced by three different actions. If a process
starts waiting for an event, this is indicated by the \emph{wait\_event}
action. Another process can set an event via the \emph{set\_event} action.
For this action it is necessary to provide the entity for which the event is
set via the \gls{btf} note field. An event can be cleared by the process for
which the event was set which is indicated by \emph{clear\_event}.
\section{Evaluation Test Bench}
\label{section:evaluation_test_bench}
To make the results of the validation process comprehensible and reproducible
for others it is important to document the hardware and software setup, the
configuration of all tools in use, as well as the ways in which the traces are
compared.
\subsection{Software Setup}
\textbf{Simulation} is used to validate the \gls{btf} traces obtained from
hardware via tracing and transformation. It allows the analysis of embedded
real-time systems by generating an event trace. A simulation is easy to
configure and executable without hardware. This is an advantage in
the early design stages of an application when the final target platform is
not yet defined.
Advanced simulation tools make it possible to take platform-dependent timing
behavior into account. It is possible to select the \gls{os} and processor platform in
use. Therefore, more accurate simulation results can be achieved. For
example, memory access times \cite{christianmaster} and timing overheads caused
by \gls{os} service routines \cite{maxmaster} can be taken into consideration.
\glsdesc{ta} provides the simulation software used for validation
\cite{tasimulator}. The \gls{ta} Simulator is based on a discrete-event system
simulation approach \cite{cassandras2008introduction, banks2000dm}. It has
already been used successfully in research projects to evaluate scheduling
algorithms in multi-core systems \cite{deubzer2011robust}, to examine
synchronization protocols \cite{alfranseder2013modified}, and to validate
optimization algorithms for embedded applications \cite{stefanmaster}. In this
thesis version {15.02.1} of the \gls{ta} Simulator is in use.
\begin{figure}[]
\centering
\begin{tikzpicture}[]
\tikzstyle{level 1}=[sibling distance=30mm]
\tikzstyle{level 2}=[sibling distance=40mm]
\node{RTE Model}
child {node {Hardware Model}}
child {node {OS Model}}
child {node {Software Model}}
;
\end{tikzpicture}
\caption[RTE model parts]{A \gls{rte} model consists of a hardware, an
\gls{os} and a software part.}
\label{fig:rte_model}
\end{figure}
\begin{figure}[]
\centering
\begin{tikzpicture}[]
\tikzstyle{level 1}=[sibling distance=20mm]
\tikzstyle{level 2}=[sibling distance=20mm]
\node{Software Model}
child {node {Processes}}
child {node {Runnables}}
child {node {Signals}}
child {node {OS Events}}
child {node {Stimuli}}
;
\end{tikzpicture}
\caption[Software model parts]{The software model represents the entities of
an application that are executed by the \gls{os} and the hardware.}
\label{fig:software_model}
\end{figure}
\textbf{Timing Models} describe the architecture and timing of an embedded
system. Model based development is a software development paradigm where the
design of an application is created in form of a timing model. This can be
done before the actual application software is implemented. Based on the
timing model requirements and constraints can be specified and validated via
simulation.
Timing models can provide different levels of granularity depending on the use
case. \gls{ta} uses the \glsdesc{rte} (\gls{rte}) model format which consists
of three parts as shown in \autoref{fig:rte_model}.
The hardware model includes the processor with all cores, quartzes, and memory
modules. Quartzes are used as a clock source for cores and memory modules.
Memory modules can be connected with each other and to the processor cores to
represent the architecture of the real chip. Vendor specific hardware models
are available for certain processor families, for example, the Infineon Aurix
and the Freescale Matterhorn.
The \gls{os} model defines the scheduling policy for an application as well as
\gls{os} related timing overheads. Implementation of service routines varies
depending on the \gls{os} vendor. Consequently, the timing overhead resulting
from these routines is also different, which makes it necessary to take their
runtime into account in order to get more accurate simulation results. Vendor
specific \gls{os} models are available for certain \glspl{os}, for example,
Elektrobit Autocore OS \cite{autocore}.
\begin{figure}[]
\centering
\begin{tikzpicture}[]
\tikzstyle{level 1}=[sibling distance=20mm]
\tikzstyle{level 2}=[sibling distance=20mm]
\tikzstyle{level 3}=[sibling distance=30mm]
\node{Application}
child {node {Task}}
child {node {Task}
child {node {Runnable}}
child {node {Runnable}
child {node {Instructionblock}}
child {node {Signal Read}}
child {node {Instructionblock}}
child {node {Signal Write}}
}
child {node {\lstinline{ActivateTask}}}
child {node {\lstinline{TerminateTask}}}
}
child {node {Task}}
;
\end{tikzpicture}
\caption[Software model hierarchy]{The software model makes it possible to
represent the runtime behavior of an application. All relevant software entities are
part of the system model and stand in relation to each other. For example, a
task can call a runnable which itself writes a signal value and runs for a
certain amount of processor instructions which is represented by an
instruction block.}
\label{fig:software_hierarchy}
\end{figure}
The software model represents how hardware and \gls{os} are used by an
application. Hardware and \gls{os} model remain the same for all tests and
only the software part is changed depending on the different test scenarios.
\autoref{fig:software_model} depicts the system entities that are part of the
software model.
Processes and runnables are ordered in a hierarchical structure as shown in
\autoref{fig:software_hierarchy}. Processes can call system routines and
runnables, while runnables can access signals, request, and release semaphores
and execute instruction blocks. The latter represent a certain number of clock
cycles required to execute a code section and are used to mimic the runtime
behavior of a real application. The number of instructions taken by an
instruction block can be configured to be static or to vary depending on a
specific distribution, e.g., Weibull distribution.
Stimuli are used to activate process entities. Similar to alarms they can
activate processes periodically or only once. Additionally, it is possible to
trigger stimuli to represent more complex activation patterns, for example,
arrival curves. Since runtime and activation patterns based on random
distributions are difficult to represent in C code, instruction blocks and
stimuli with constant values are used for the test models.
\textbf{Code Generation} is used to create C code based on the timing model of
an application. A template based model export was specified and implemented in
the context of this thesis. The solution is already in production and makes it
possible to create C code and the corresponding \gls{oil} files automatically.
The idea is to iterate over all software entities and create the appropriate
code dependent on the entity type. Transformation of most model entities is
straightforward. Runnable calls map to function calls in C. A signal read
access occurs if one signal is assigned to another variable. Accordingly, a
write access is represented by assigning a value to a signal. Task, event, and
semaphore actions are created based on the respective \gls{osek} service
routines discussed in \autoref{section:osekvdxos}.
An instruction block is the only software model entity that cannot be mapped to
C code straightforwardly. As discussed before, an instruction block represents
a certain amount of clock cycles required to execute a code section. Normally,
this value is set based on measurement results or empirical values from other
applications. For code generation it is necessary to create code whose
execution takes the same amount of clock cycles as specified in the model.
\begin{code}
\begin{lstlisting}[caption={[Instructionblock]
The function takes the specified amount of clock cycles to be executed.
This code is dependent on hardware and compiler in use and must
therefore be adapted to other platforms.},
label={listing:instructionblock}]
void executeInstructionsConst(int clockCycles) {
int i;
clockCycles /= 2;
for (i = 0; i < clockCycles; i++) {
__asm("nop");
__asm("nop");
}
}
\end{lstlisting}
\end{code}
The obvious way to do so is a for loop; however, the exact code is dependent on
compiler and hardware. \autoref{listing:instructionblock} shows the code
necessary to get the desired behavior for the hardware used in this thesis. It
works because the Infineon Aurix processor family features zero overhead loops.
This means a for loop with one \lstinline{nop} instruction takes exactly one
clock cycle because loop condition check, loop incrementation, and loop content
are executed in parallel.
It is important to add multiple \lstinline{nop} instructions per loop cycle.
The Aurix trace device implements a compressed program flow trace. This means
trace messages are only created for certain function events. Since the
\lstinline{loop} assembly instruction is one of the commands that cause the
creation of a trace message, a loop with a single \lstinline{nop} would cause
the trace buffer to overflow if the value of \lstinline{clockCycles} exceeds a
certain value. By adding additional \lstinline{nop} commands, fewer trace
messages are created per time unit and the function events can be transmitted
off-chip without overflowing.
\textbf{\glsdesc{ee}} is an \gls{osek} compliant real-time operating system.
It is free of charge and open-source, which makes it an excellent choice for
this thesis. Without access to the \gls{os} internal code, the creation of many
\gls{btf} events would not have been feasible. The \gls{ee} software package
contains the complete \gls{os} source code as well as RT-Druid, the code
generation tool to create \gls{os} specific source code from the \gls{oil}
file. In this thesis the \glsdesc{ee} and RT-Druid 2.4.0 release is used.
\begin{code}
\begin{lstlisting}[caption={[\gls{ee} \gls{oil} config] Subset of the \gls{ee}
\gls{oil} \gls{os} attributes used for validation. Attributes that are not
mentioned are set to the default value described in the \gls{ee} RT-Druid
reference manual.},
label={listing:oilconfig}]
EE_OPT = "EE_EXECUTE_FROM_RAM";
EE_OPT = "EE_ICACHE_ENABLED";
EE_OPT = "EE_DCACHE_ENABLED";
REMOTENOTIFICATION = USE_RPC;
CFLAGS = "-O2";
STATUS = EXTENDED;
ORTI_SECTIONS = ALL;
KERNEL_TYPE = ECC2;
COMPILER_TYPE = GNU;
\end{lstlisting}
\end{code}
\autoref{listing:oilconfig} shows the \gls{oil} attributes set for validation.
All attributes that are not mentioned take their default value as documented by
the RT-Druid reference manual \cite{rtdruidref}. The test applications are
executed from RAM, instruction and data caching is enabled, and the
\lstinline{O2} optimization level is configured. Inter-core communication is
implemented via remote procedure calls. All \gls{orti} attributes and extended
error codes are logged by the \gls{os}. The configuration is created in a way
that allows maximum traceability combined with decent performance.
Consequently, a similar configuration could also be used in a production
system.
The \textbf{Hightec Compiler} \cite{hightec} is used to compile the C code
produced by code generation and RT-Druid. It is based on GCC, and \gls{ee}
generates appropriate makefiles automatically if \lstinline{GNU} is set as
compiler. For the tests, Hightec Compiler v4.6.5.0 is used.
\textbf{TRACE32} \cite{trace32} is used as the hardware trace host software.
Configuration of this part of the test setup is the most complex. Different
vendor specific properties, like the number of processor observation blocks,
must be taken into consideration to create a setup that produces optimal
results. The hardware used and the corresponding configuration are discussed in
the next section.
\subsection{Hardware Setup}
\label{subsection:hardware_setup}
An \textbf{Infineon TriBoard TC298} evaluation board assembled with the
Infineon \textbf{SAK-TC298TF-128} microcontroller is used for evaluation. This
board provides an \glsdesc{imds} together with an \glsdesc{agbt}. According to
\autoref{tab:trace_tool_overview} and \autoref{tab:interfaces} this setup
allows for optimal trace performance.
\begin{code}
\begin{lstlisting}[caption={[\gls{ee} ECU config] \gls{ee} ECU config to
support the Infineon TC27x microcontroller family and the TC2X5 evaluation
board. Source code changes are necessary to support the hardware used in this
thesis.},
label={listing:ecu_config}]
MCU_DATA = TRICORE {
MODEL = TC27x;
};
BOARD_DATA = TRIBOARD_TC2X5;
\end{lstlisting}
\end{code}
\gls{ee} provides support for the Infineon TC27x processor family which can be
activated in the \gls{oil} file as shown in \autoref{listing:ecu_config}.
TC27x and TC29x are quite similar. Nevertheless, it is important to adapt the
configuration to the TC298TF processor. This is done by changing the includes
in \lstinline{./cpu/tricore/inc/ee_tc_cpu.h} from
\lstinline{<tc27xa/Ifx_reg.h>} to \lstinline{<tc29xa/Ifx_reg.h>}. The layout
of the evaluation board is the same.
Based on \lstinline{MCU_DATA} \gls{ee} configures the controller in the correct
way during system initialization. The \gls{oil} \lstinline{CPU_CLOCK}
attribute can be used to set the desired CPU frequency. The configuration done
by \gls{ee} is sufficient to put the controller into a usable state. However,
there are problems regarding the frequency of the Multi-Core Debug System
($f_{mcds}$). The TC298TF can run at a frequency up to \unit[300]{MHz}.
\gls{ee} does not configure the MCDS clock divisor at all and consequently
$f_{mcds}$ is equal to the system frequency. However, the TC29xA user manual
states that the maximum allowed value for $f_{mcds}$ is \unit[160]{MHz}
\cite{tc29xa}.
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{./media/eval/clocks.png}
\caption[Evaluation clock configuration]{Correct clock settings are essential
to record valid hardware traces for the Infineon TC298TF microcontroller. The
multi-core debug system frequency must be lower or equal to \unit[160]{MHz}
and the ratio between CPU and MCDS frequency must be $1:1$.}
\label{fig:clocks}
\end{figure}
Incorrect clock configuration may result in data and function events being
dropped randomly. According to the manual, it is necessary to set the
$f_{system}$ to $f_{mcds}$ ratio to $2:1$ to avoid this problem. Despite using
the documented configuration, event dropping still occurred during the
validation. After consultation with the hardware experts from Lauterbach GmbH,
it turned out that a ratio of $1:1$ between system and MCDS clock is the only
way to guarantee the reception of all trace events. Thus, the \gls{ee} clock
configuration must not be changed, but the system frequency must be less than
or equal to \unit[160]{MHz}. \autoref{fig:clocks} shows a configuration with a
system frequency of \unit[100]{MHz} as used in this thesis.
The \textbf{PowerTrace II} by Lauterbach is used for trace recording. During
the compilation process, \gls{ee} creates so-called Lauterbach PRACTICE
scripts \cite{cmmref}, also known as cmm scripts. These scripts can be used to
operate the TRACE32 software automatically. The scripts generated by \gls{ee}
are inadequate for the requirements of this thesis. Thus, it is necessary to
improve the scripts in a way that allows continuous data and function trace as
shown in \autoref{listing:trace32_config}.
\begin{code}
\begin{lstlisting}[caption={[TRACE32 config]
Script to configure TRACE32 and the on-chip trace device. The setup allows for
continuous function and data trace.},
label={listing:trace32_config}]
SYStem.CPU TC298TF
trace.method.analyzer
trace.mode.stream
mcds.source.cpumux0.program on
mcds.source.cpumux0.readaddr on
mcds.source.cpumux0.writeaddr on
mcds.source.cpumux0.writedata on
break.set symbol.begin(foo)--symbol.end(bar) /r /w /tracedata
Go
wait 1.s
break
printer.filetype csv
printer.file data.csv
winprint.trace.findall , cycle readwrite /list %timefixed \
ti.zero varsymbol cycle data
trace.export.csvfunc func.csv
\end{lstlisting}
\end{code}
Firstly, it is necessary to select the correct CPU (line 1). This is
important because otherwise the TRACE32 trace decoder is not able to interpret
the hardware trace events in the correct way. The trace method
\lstinline{analyzer} is required for real-time tracing and trace mode
\lstinline{stream} means that the trace data is sent to the host computer in a
continuous way (lines 2 and 3).
Next, the processor and bus observation blocks are configured to detect all
function and data events (lines 5-9). This is done via the multi-core debug
system. Setting the \lstinline{program} attribute to \lstinline{on} activates
the function trace. The other three attributes are necessary to record all
data events.
A complete data trace may still exceed the bandwidth of the setup. Via
\lstinline{break.set}, filters as described in
\autoref{section:trace_measurement} can be created (line 10). The trace device
is configured to record data read and write events for all variables in the
memory range defined by \lstinline{symbol.begin(foo)--symbol.end(bar)}. Here
\lstinline{foo} is a variable that has a lower address than the variable
\lstinline{bar}. Using the configuration described in this section, the
compiler allocates the array \lstinline{EE_as_rpc_services_table} at the
beginning of the \gls{os} memory section and \lstinline{EE_th_status} at the
end. So those two variables provide a convenient boundary to detect all
\gls{os} data events of interest.
Trace recording is started via the \lstinline{Go} command (lines 12-14). The
\lstinline{wait} command waits for a suitable amount of time, and recording is
stopped by the \lstinline{break} command.
Now the data and function traces can be exported (lines 16-21). For the data
export it is first necessary to configure the desired output file type
(\lstinline{csv}) and output filename (\lstinline{data.csv}). Via the
\lstinline{winprint} command the data export process is started, and
\lstinline{trace.export.csvfunc} exports the function trace.
TRACE32 creates multiple graphical user interfaces, one for each core of the
target platform. Accordingly, the export commands must be executed for each
core, in other words for each GUI\@. The resulting files
\lstinline{data.csv} and \lstinline{func.csv} contain one event per line. The
following listing shows a data event.
\begin{lstlisting}
-0083448136,0.0004372600,"EE_ORTI_servicetrace","wr-data",43
\end{lstlisting}
A Lauterbach data event consists of five comma separated fields. In
\autoref{eq:data_event} the elements of a data event are defined. The second
field is the timestamp $t_i$ in seconds, the third field is the name of the
accessed variable $\pi_i$, the fourth field specifies in which way $a_i$ the
variable is accessed (a data write in this case), and the fifth field contains
the value of the data access event $v_i$. Since one data trace file is
exported per core, the core name $c_i$ is the same for all events from one
file. The next listing shows a Lauterbach function event
consisting of three fields.
\begin{lstlisting}
+437050; EE_as_StartCore; fentry
\end{lstlisting}
In \autoref{eq:function_event} the elements of a function event are defined.
Analogous to data events, the core name $c_j$ is the same for all events within
a file. The first field maps to the timestamp $t_j$, the second field is the
name of the function $\pi_j$ that is affected by the event, and the third field
indicates the way $\theta_j$ in which the function is affected.
\subsection{Validation Techniques}
\label{subsection:validation_techniques}
Traces can differ in two ways. A temporal difference exists for two traces
$B^1$ and $B^2$ with the same length $n$ if there is at least one event pair
with the index $i \in (1,2,\dots,n)$ for which $t^1_i \neq t^2_i$. As
discussed before, the \gls{ta} Simulator is capable of taking hardware and
\gls{os} specific behavior into account. Nevertheless, simulating a trace for
which all timestamps are equal to the corresponding hardware trace is not
feasible by definition \cite{balci1995principles}.
This problem is bypassed in two steps. First, the general accuracy of the
trace setup is validated by tracing events whose timing characteristics are
precisely known in advance. Second, for the actual test models, a
plausibility test based on certain metrics such as task activate-to-activate
and task response time is conducted.
The second way in which two traces can differ is called semantic difference.
It exists for two traces $B^1$ and $B^2$ with the same length $n$ if there is
an event pair with the index $i \in (1,2,\dots,n)$ for which at least one of
the following cases is true: source or target entity differ ($\Psi^1_i \neq
\Psi^2_i \vee T^1_i \neq T^2_i$), source or target instance differ ($\psi^1_i
\neq \psi^2_i \vee \tau^1_i \neq \tau^2_i$), target type differs ($\iota^1_i
\neq \iota^2_i$), event action differs ($\alpha^1_i \neq \alpha^2_i$), or note
differs ($\nu^1_i \neq \nu^2_i$).
If two traces $B^1$ and $B^2$ have different lengths $|B^1| \neq |B^2|$, they
also differ semantically. Assuming the trace and simulation setup is correct,
a difference in length can have two reasons: either the trace durations differ
or one trace includes entities that do not occur in the other trace. In the
former case, the disparity can be fixed by removing events at the end of
the longer trace until both traces have the same length. In the latter case,
events for entities that are not contained in both traces may be removed in
order to achieve semantic equality.

\section{Test Cases}
As discussed in the previous section, traces can differ in a temporal and in a
semantic way. To rule out temporal discrepancies caused by an incorrect trace
setup, the timing accuracy is tested based on code with known event-to-event
durations. Next, the semantic correctness of the trace mapping is validated
based on manually created test models. Finally, randomized models are
generated in order to detect semantic errors that may not be detected by the
manually created models due to selection bias \cite{geddes1990cases}.
\subsection{Timing Precision}
In \autoref{listing:instructionblock}, code to execute a fixed number of
instructions was introduced. This code is now used to evaluate the timing
precision of the trace setup. According to
\autoref{subsection:hardware_tracing}, the setup should allow for cycle
accurate trace measurement.
The Infineon Aurix processor family provides performance counters
\cite{tc29xa}. Once started, these counters are incremented based on the CPU
core frequency. A frequency of \unit[100]{MHz} is used for the validation;
consequently, an increment occurs every \unit[10]{ns}. The counter can be
started at an arbitrary point in time, for example at program start. By
reading the counter value at the beginning and at the end of a critical
section, the number of clock cycles that expired between these two points can
be determined.
\begin{code}
\begin{lstlisting}[caption={[Trace setup accuracy validation]
Code to validate the timing precision of the trace setup.},
label={listing:accuracy_validation}]
EE_UINT32 i;
EE_UINT32 ccntStart;
EE_UINT32 ccntEnd;
EE_UINT32 n = N / 4;
__asm("nop");
ccntStart = EE_tc_get_CCNT();
__asm("nop");
for (i = 0; i < n; i++) {
__asm("nop");
__asm("nop");
__asm("nop");
__asm("nop");
}
__asm("nop");
ccntEnd = EE_tc_get_CCNT();
\end{lstlisting}
\end{code}
\autoref{listing:accuracy_validation} shows the code that is used to check the
timing precision. \gls{ee} provides the API function
\lstinline{EE_tc_get_CCNT} to read out the performance counter register. As
described above, the performance counters are read out before and after the
critical section.
The critical section is guarded with two additional \lstinline{nop} assembly
instructions to avoid compiler optimization. Additionally, the generated
assembly code was examined manually to verify that no unwanted instructions
were added by the compiler. A for loop is used to execute a predefined number
of instructions. The number of repetitions depends on the define
\lstinline{N}, which should be a multiple of four.
The code is now executed for different values of \lstinline{N}. For each run,
the expected number of clock cycles $c_e$, the actual number of clock cycles
$c_a$, the expected time difference $t_e$ in microseconds, and the actual time
difference $t_a$ in microseconds between the writes to \lstinline{ccntStart}
and \lstinline{ccntEnd} are listed in \autoref{tab:precision_validation}.
The expected number of clock cycles is calculated by $c_e = N + 2$. The value
two is added because of the two additional \lstinline{nop} instructions. The
expected time is calculated by $t_e = c_e \cdot \frac{1}{f}$, where $f$ is the
processor frequency.
The actual number of clock cycles is calculated by $c_a = ccntEnd - ccntStart$.
The actual time is calculated by $t_a = t_j - t_i$ where $j$ is the index of
the write event to \lstinline{ccntEnd} and $i$ is the index of the write event
to \lstinline{ccntStart}.
Four different values for \lstinline{N}, $128$, $1024$, $4096$, and $65536$
are chosen and for each value $101$ measurement samples are taken. The
results for all samples with the same value of \lstinline{N} are equal. It can
be observed that for all values of \lstinline{N} the execution of the critical
section takes four ticks more than the expected value $c_e$. This is because
the additional instructions executed by the second call to
\lstinline{EE_tc_get_CCNT} are not taken into consideration.
Consequently, the expected and the actual execution time differ by
\unit[40]{ns}. Apart from this difference, the result is as expected, and the
conclusion can be drawn that the setup is in fact able to measure hardware
events on a cycle accurate basis.
\begin{table}[]
\centering
\begin{tabular}{c|c c c c}
N & 128 & 1024 & 4096 & 65536 \\
\hline
$c_e\, [1]$ & 130 & 1026 & 4098 & 65538 \\
$c_a\, [1]$ & 134 & 1030 & 4102 & 65542 \\
$t_e\, [us]$ & 1.300 & 10.260 & 40.980 & 655.380 \\
$t_a\, [us]$ & 1.340 & 10.300 & 41.020 & 655.420 \\
samples & 101 & 101 & 101 & 101 \\
\end{tabular}
\caption[Trace setup measurement precision]{Experiment to validate the
accuracy of the trace setup. A code snippet that takes a known number of
instructions $c_e$ is executed. Based on the number of instructions the
expected execution time $t_e$ can be calculated. If cycle accurate
measurement is supported, the actual execution time $t_a$ should be equal to
$t_e$. The execution times differ by \unit[40]{ns} because the expected
number of instructions is off by four cycles. If this deviation is taken
into consideration $t_e$ and $t_a$ coincide.}
\label{tab:precision_validation}
\end{table}
\subsection{Systematic Tests}
\label{subsection:systematic_tests}
In this section test models are created systematically to validate the complete
software to \gls{btf} event mapping discussed in \autoref{chapter:mapping}.
For each test application a simulated and a hardware based \gls{btf} trace is
generated as shown in \autoref{fig:eval_idea}. The traces are then compared in
three steps.
\begin{itemize}
\item A basic plausibility test based on the Gantt chart of the TA Tool Suite
is conducted.
\item The semantic equality is validated.
\item Different real-time metrics are compared and discussed.
\end{itemize}
Five test models as shown in the following list are required to cover all
\gls{btf} actions for which a mapping has been provided.
\begin{itemize}
\item task-runnable-signal
\item task-event
\item task-resource-release-parking
\item task-resource-poll-parking
\item task-MTA
\end{itemize}
Each model represents a periodic system where a defined sequence of events is
executed every \unit[10]{ms}. UML sequence diagrams \cite{fowler2004uml} are
used to illustrate the behavior of the test applications during one period.
\subsubsection{Task-Runnable-Signal Test}
\begin{figure}[]
\centering
\centerline{\includegraphics[width=\textwidth]{./media/eval/task_runnable_signal.pdf}}
\caption[Task-runnable-signal test sequence]{Test application to validate
basic task and signal read and write events.}
\label{fig:task_runnable_signal}
\end{figure}
The task-runnable-signal application is depicted in
\autoref{fig:task_runnable_signal}. Task \lstinline{T_1} is activated
periodically by the stimulus \lstinline{STI_T_1} every \unit[10]{ms}.
\lstinline{T_1} activates \lstinline{T_2} on another core via \gls{ipa} and
then executes runnable \lstinline{R_1}. \lstinline{T_2} executes a runnable
\lstinline{R_2_1} which executes another runnable \lstinline{R_2_2}. Once
execution of \lstinline{R_1} has finished, \lstinline{T_1} activates another
task \lstinline{T_3} on the second core, which has a higher priority than
\lstinline{T_2}. Consequently, \lstinline{T_2}, \lstinline{R_2_1}, and
\lstinline{R_2_2} are preempted as indicated by the light green and light blue
colors. \lstinline{T_3} calls a runnable \lstinline{R_3}. The runnables
\lstinline{R_1} and \lstinline{R_3} both read and write the signal
\lstinline{SIG_1}. Once \lstinline{T_3} has terminated, \lstinline{T_2} and the
corresponding runnables resume execution. The purpose of this test application
is to cover the following \gls{btf} actions:
\begin{itemize}
\item Stimulus: trigger by alarm and \gls{ipa}
\item Task: activate, start, preempt, resume, terminate
\item ISR: activate, start, terminate
\item Runnable: start, resume, suspend, terminate
\item Signal: read, write
\end{itemize}
\begin{figure}[]
\centering
\centerline{\includegraphics[width=\textwidth]{./media/eval/task_runnable_signal.png}}
\caption[Task-runnable-signal test gantt chart]{Hardware and software trace
for the task-runnable-signal test model. Attention must be directed to the
signal read and write accesses to \lstinline{SIG_1}. Additionally, the nested
runnables must be suspended when the respective task \lstinline{T_2} is
preempted.}
\label{fig:task_runnable_signal_gantt}
\end{figure}
Based on the Gantt chart of the TA Tool Suite, the \gls{btf} traces can be
compared visually. The hardware trace is shown in the upper part and the
simulated trace in the lower part of each picture. Both traces use the same
time scale so that semantic and temporal comparison is feasible.
\autoref{fig:task_runnable_signal_gantt} shows one period of the
task-runnable-signal test application in the Gantt chart of the \gls{ta} Tool
Suite. The figure depicts that \lstinline{R_2_2} is called from the context of
\lstinline{R_2_1}. When \lstinline{T_2} is preempted, both runnables must be
suspended too, indicated by the light blue color in contrast to the stronger
blue when a runnable is running. Runnable entities are not shown in the traces
for the other test models for clarity. A running task is colored in dark
green, while preempted tasks are shown in light green.
A separate row in the Gantt chart is used to depict signal accesses from the
context of tasks. Whenever a horizontal line is drawn the corresponding signal
is read or written. The former is indicated by an arrow pointing up at the
bottom of the row. The latter is indicated by an arrow pointing down at the
top of the row. It can be seen that the signal accesses are recorded on
hardware as expected.
The hardware trace shows two additional \glspl{isr} that are not part of the
simulation trace. \lstinline{EE_tc_system_timer_handler} is a timer interrupt
which is executed every \unit[1]{ms} and serves as clock source for the system
counter. \lstinline{EE_TC_iirq_handler} is used for remote procedure calls.
Two traces cannot be semantically identical if entities exist in one trace
that are not part of the other trace. There are two ways to solve this
problem. Either the \glspl{isr} are added to the system model and therefore
considered during simulation, or all \gls{btf} events related to the
\glspl{isr} are removed from the hardware trace.
A script that checks the semantic equality of two traces based on the criteria
established in \autoref{subsection:validation_techniques} is used for the
second validation step. However, semantic equality could not be shown for the
test cases in this and the next section. The reason for this is discussed in
\autoref{subsection:randomized_tests}.
The TA Inspector is capable of calculating a variety of real-time metrics based
on \gls{btf} traces. Selected metrics are shown to discuss the similarities
and discrepancies between hardware and simulation trace. Common metric types
are activate-to-activate (A2A), response time (RT), net execution time (NET),
and CPU core load. The upper part of each metric table shows the hardware
trace metrics abbreviated by \emph{HW} and the lower part shows the
simulation trace metrics abbreviated by \emph{Sim}.
\begin{table}[]
\centering
\begin{tabular}{c c|c c c c}
& & A2A $[ms]$ & RT $[ms]$ & Load Core\_1 $[\%]$ & Load Core\_2 $[\%]$ \\
\hline
& T\_1 & 10.005998 & 3.025510 & 30.124423 & 0.000000 \\
HW & T\_2 & 10.005990 & 6.516440 & 0.000000 & 49.950032 \\
& T\_3 & 10.005987 & 1.506300 & 0.000000 & 15.000495 \\
\hline
& Sum & - & - & 30.12 & 64.95 \\
&&&&& \\
& T\_1 & 10.000000 & 3.000100 & 30.000000 & 0.000000 \\
Sim & T\_2 & 10.000000 & 6.500200 & 0.000000 & 50.000000 \\
& T\_3 & 10.000000 & 1.500100 & 0.000000 & 15.000000 \\
\hline
& Sum & - & - & 30.00 & 65.00 \\
\end{tabular}
\caption[Task-runnable-signal metrics table]{Metrics of the
task-runnable-signal test application. Activation-to-activation (A2A) and
response time (RT) are average values calculated over all instances of the
respective entity.}
\label{tab:task_runnable_signal}
\end{table}
\autoref{tab:task_runnable_signal} shows selected real-time metrics for the
task-runnable-signal application. At first glance all values seem identical,
so the basic configuration of the complete setup is likely to be correct.
Nevertheless, the activate-to-activate times of hardware and simulation differ
by almost \unit[6]{us}, which is non-negligible.
The reason for this deviation can be found by examining the
activate-to-activate times of the timer \gls{isr}
\lstinline{EE_tc_system_timer_handler}. The average A2A time for the \gls{isr}
is \unit[600]{ns} greater than expected. Since \lstinline{T_1} is activated
every \unit[10]{ms} or in other words for every tenth instance of the timer
\gls{isr}, the expected deviation can be calculated as $d_{A2A} = 10 \cdot
600\,ns = 6\,us$.
To determine why the A2A times of the timer \gls{isr} diverge, it is necessary
to read the corresponding source code. Whenever the timer \gls{isr} is
executed, the time delta to the next instance is calculated based on the
current number of counter ticks in the timer register. There is a time delta
between the point where the last counter ticks value is read and the point
where the newly calculated value is written. This delta causes the delay of
\unit[600]{ns}. Doubling the frequency reduces the delta to \unit[300]{ns};
halving the frequency increases it to \unit[1200]{ns}, as expected.
\subsubsection{Task-Event Test}
\begin{figure}[]
\centering
\centerline{\includegraphics[width=\textwidth]{./media/eval/task_event.pdf}}
\caption[Task-event test sequence]{Test application to validate \gls{btf}
event actions.}
\label{fig:task_event}
\end{figure}
\autoref{fig:task_event} shows the task-event test case. \lstinline{T_1} is
activated in the same way as in the first test case. Again, it activates
\lstinline{T_2} on a second core via \gls{ipa}. \lstinline{T_2} executes a
runnable \lstinline{R_2}. After execution of the runnable, \lstinline{T_2}
waits for the event \lstinline{EVENT_1}. Since the event is not set, it
changes into the waiting state, indicated by the orange color. After
activating \lstinline{T_2}, \lstinline{T_1} executes a runnable \lstinline{R_1}
and sets the event \lstinline{EVENT_1}. \lstinline{T_2} returns from the
waiting state, calls \lstinline{R_2} again, and clears the event
\lstinline{EVENT_1}. The purpose of this test application is to cover the
following \gls{btf} actions:
\begin{itemize}
\item Process: wait, release
\item Event: wait\_event, set\_event, clear\_event
\end{itemize}
\begin{figure}[]
\centering
\centerline{\includegraphics[width=\textwidth]{./media/eval/task_event.png}}
\caption[Task-event test gantt chart]{Comparison of hardware (top) and
simulated (bottom) trace of the task event test application.}
\label{fig:task_event_gantt}
\end{figure}
\autoref{fig:task_event_gantt} shows the Gantt chart for the task-event test
case. As before \lstinline{T_1} is interrupted by the timer \gls{isr} multiple
times. A separate row in the Gantt chart is used to indicate the current state
of the event entity. An upward pointing arrow indicates that a process starts
waiting for an event. The waiting period is colored in orange. A downward
pointing arrow indicates that a process sets an event. Finally, the event is
cleared, which is indicated by a downward pointing arrow in red.
\begin{table}[]
\centering
\begin{tabular}{c c|c c c}
 & & A2A $[ms]$ & RT $[ms]$ & CPU Waiting Core\_2 $[\%]$ \\
\hline
HW & T\_1 & 10.006198 & 2.023460 & 0.000000 \\
 & T\_2 & 10.006189 & 3.018570 & 10.046955 \\
\hline
Sim & T\_1 & 10.000000 & 2.000100 & 0.000000 \\
 & T\_2 & 10.000000 & 3.000100 & 9.999000 \\
\end{tabular}
\caption[Task-event metrics table]{Metrics of the task-event test application.}
\label{tab:task_event}
\end{table}
\autoref{tab:task_event} shows the resulting metrics for the task-event test
case. The activate-to-activate times show the same behavior as in the
previous test application, as expected. The relative waiting time on hardware
is greater than it is for the simulated trace.
A possible reason might be the longer runtime of the \lstinline{set_event}
routine on-target. The task on core \lstinline{Core_1} sets the event for the
task on the second core. Therefore, a \glsdesc{rpc} is necessary to
set the event. Since the \gls{rpc} via \lstinline{EE_TC_iirq_handler} is not
taken into consideration in the simulation, the time in the waiting state is
longer on hardware.
Response times are also significantly longer on real hardware compared to
the simulated trace. The response time measures the period from activation to
termination of a task instance. The difference in response time results from
several factors.
Firstly, the initial ready time, i.e.\ the period from task activation to
start, is longer on hardware; it takes about \unit[2]{us}. Secondly,
\lstinline{T_1} is preempted by the timer \gls{isr} two times. Category two
\glspl{isr} require a context switch which costs additional task execution
time. Finally, the \gls{ipa} and \lstinline{TaskTerminate} routines take
longer on real hardware. By measuring the execution times of the respective
system services, it could be shown that the response times are equal if the
measured overhead is taken into consideration. As mentioned before, these
effects could be accounted for in the simulation by adding the execution times
of the routines to the \gls{os} part of the timing model.
\subsubsection{Task-Resource Tests}
\begin{figure}[]
\centering
\centerline{\includegraphics[width=\textwidth]{./media/eval/task_resource_release_polling.pdf}}
\caption[Task-resource-poll-parking test sequence]{Test application to validate
semaphore events, especially the poll\_parking action.}
\label{fig:task_resource_poll_parking}
\end{figure}
\begin{figure}[]
\centering
\centerline{\includegraphics[width=\textwidth]{./media/eval/task_resource_release_parking.pdf}}
\caption[Task-resource-release-parking test sequence]{Test application to validate
semaphore events, especially the release\_parking action.}
\label{fig:task_resource_release_parking}
\end{figure}
The third and fourth test cases are similar except for one difference, as
shown in \autoref{fig:task_resource_poll_parking} and
\autoref{fig:task_resource_release_parking}. As before, \lstinline{T_1} is
activated by a periodic stimulus and activates \lstinline{T_2} on another core
via \gls{ipa}. \lstinline{T_1} executes the runnable \lstinline{R_1_1}, which
requests the semaphore \lstinline{SEM_1}. \lstinline{T_2} tries to request the
same semaphore, which is now locked, and changes into the active polling state
indicated by the red color. As soon as \lstinline{R_1_1} finishes,
\lstinline{T_1} activates the task \lstinline{T_3}, which has a higher priority
than \lstinline{T_2}, on the second core. Consequently, \lstinline{T_2} is
deallocated and changes into the parking state.
At this point the two models differ. In the first model,
\emph{task-resource-poll-parking}, \lstinline{T_3} has a shorter execution
time than in the model \emph{task-resource-release-parking}. Consequently, in
the former model \lstinline{T_2} is resumed while \lstinline{SEM_1} is still
locked and a poll\_parking action takes place.
In the latter case, when \lstinline{T_3} has a longer execution time,
\lstinline{SEM_1} becomes free while \lstinline{T_2} is still preempted. This
results in a release\_parking action and \lstinline{T_2} changes into the
ready state. Once \lstinline{T_3} has terminated, \lstinline{T_2} continues
running immediately. The purpose of these applications is to test the
following actions.
\begin{itemize}
\item Process: park, poll\_parking, release\_parking, poll, run
\item Semaphore: ready, lock, unlock, full, overfull
\item Process-Semaphore: requestsemaphore, assigned, waiting, released
\end{itemize}
\begin{figure}[]
\centering
\centerline{\includegraphics[width=\textwidth]{./media/eval/task_resource_release_polling.png}}
\caption[Task-resource-poll-parking test gantt chart]{Comparison of hardware
(top) and simulated (bottom) trace of the task-resource-poll-parking test
application.}
\label{fig:task_resource_poll_parking_gantt}
\end{figure}
\begin{figure}[]
\centering
\centerline{\includegraphics[width=\textwidth]{./media/eval/task_resource_release_parking.png}}
\caption[Task-resource-release-parking test gantt chart]{Comparison of hardware (top) and
simulated (bottom) trace of the task-resource-release-parking test application.}
\label{fig:task_resource_release_parking_gantt}
\end{figure}
\begin{table}[]
\centering
\begin{tabular}{c c|c c c}
& & RT $[ms]$ & Polling Time $[ms]$ & Parking Time $[ms]$ \\
\hline
& T\_1 & 2.524897 & 0.000000 & 0.000000 \\
HW & T\_2 & 3.269190 & 0.751730 & 0.508011 \\
& T\_3 & 0.506321 & 0.000000 & 0.000000 \\
\hline
& T\_1 & 2.500140 & 0.000000 & 0.000000 \\
Sim & T\_2 & 3.250040 & 0.749800 & 0.500100 \\
& T\_3 & 0.500100 & 0.000000 & 0.000000 \\
\end{tabular}
\caption[Task-resource-poll-parking metrics table]{Metrics of the
task-resource-poll-parking test application.}
\label{tab:task_resource_poll_parking}
\end{table}
\begin{table}[]
\centering
\begin{tabular}{c c|c c c}
& & A2A $[ms]$ & RT $[ms]$ & CPU Parking Core\_2 $[\%]$ \\
\hline
& T\_1 & 10.005997 & 2.026420 & 0.000000 \\
HW & T\_2 & 10.005989 & 2.772670 & 4.984965 \\
& T\_3 & 10.005984 & 0.756450 & 0.000000 \\
\hline
& T\_1 & 10.000000 & 2.000140 & 0.000000 \\
Sim & T\_2 & 10.000000 & 2.750240 & 4.949010 \\
& T\_3 & 10.000000 & 0.750100 & 0.000000 \\
\end{tabular}
\caption[Task-resource-release-parking metrics table]{Metrics of the
task-resource-release-parking test application.}
\label{tab:task_resource_release_parking}
\end{table}
\autoref{fig:task_resource_poll_parking_gantt} and
\autoref{fig:task_resource_release_parking_gantt} show the comparison of the
traces for the two resource test applications. For both test cases
\lstinline{T_1} requests \lstinline{SEM_1} as indicated by an upward pointing
arrow. The semaphore is now locked and \lstinline{T_2} changes into the
polling mode when requesting it. This is indicated by the yellow color. Once
\lstinline{T_3} is activated \lstinline{T_2} changes into the parking mode
indicated by the orange color.
In \autoref{fig:task_resource_poll_parking_gantt} \lstinline{T_3} has a runtime
of \unit[500]{us} and resumes running before the semaphore is released. Thus,
it returns into the polling state until the semaphore is released. The release
event is depicted by a downward pointing arrow.
In \autoref{fig:task_resource_release_parking_gantt} the execution time of
\lstinline{T_3} is longer and \lstinline{T_1} releases the semaphore earlier.
Consequently,
\lstinline{SEM_1} becomes free while \lstinline{T_2} is still deallocated from
the core and changes into the ready state.
For both resource test applications the \gls{btf} traces recorded from hardware
match the simulated traces as shown in the previous figures. The metrics in
\autoref{tab:task_resource_poll_parking} and
\autoref{tab:task_resource_release_parking} show similar results compared to the
previous tables and are therefore not discussed again.
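Metrics such as the response times (RT) listed in the tables above can be
derived directly from trace events. A minimal sketch, assuming illustrative
(timestamp, task, event) tuples and event names rather than the full BTF
format, matches each activation with the corresponding termination:

```python
# Sketch: per-instance response times from (timestamp, task, event) tuples.
# Event names and the tuple layout are illustrative assumptions.
from collections import defaultdict

def response_times(events):
    pending = defaultdict(list)  # task -> pending activation timestamps (FIFO)
    rts = defaultdict(list)      # task -> response time per instance
    for ts, task, event in events:
        if event == "activate":
            pending[task].append(ts)
        elif event == "terminate" and pending[task]:
            # RT of the oldest pending instance: activation to termination.
            rts[task].append(ts - pending[task].pop(0))
    return rts

events = [
    (0, "T_1", "activate"),
    (2525, "T_1", "terminate"),
    (10000, "T_1", "activate"),
    (12525, "T_1", "terminate"),
]
rts = response_times(events)
# rts["T_1"] -> [2525, 2525]
```

The FIFO queue per task also covers the multiple-activation case, where a
second activation arrives before the first instance terminates.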
\subsubsection{Task-MTA Test}
\begin{figure}[]
\centering
\centerline{\includegraphics[width=\textwidth]{./media/eval/task_mta.pdf}}
\caption[Task-MTA test sequence]{Test application to validate mtalimitexceeded
events.}
\label{fig:task_mta}
\end{figure}
The purpose of the last specified test application is to validate the
correctness of \gls{mta} and mtalimitexceeded events. \autoref{fig:task_mta}
shows the sequence diagram of the respective test model. In this example
\lstinline{T_2} is allowed to have two activations. This means two instances
of the task may be active in the system at the same point in time.
Like in the previous tests \lstinline{T_1} is activated by \lstinline{STI_T_1}
periodically. \lstinline{T_1} then activates \lstinline{T_2} three consecutive
times via inter-core \gls{ipa}. The runnable \lstinline{R_1} is executed to
consume some time between the activations. After the first activation the task
starts running as expected. The second activation is stored by the \gls{os}.
Once \lstinline{T_2} terminates, it changes into the ready state and starts
running again. The third activation is not allowed by the \gls{os} as
indicated by the red box. An error message is created and a mtalimitexceeded
event must be added to the \gls{btf} trace.
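The activation bookkeeping described above can be sketched as follows; the
class and method names are illustrative and not taken from the actual
operating system implementation:

```python
# Sketch of multiple-task-activation (MTA) bookkeeping: up to `limit`
# instances of a task may be active; a further activation is rejected and
# recorded as a mtalimitexceeded event. Names are illustrative.
class TaskActivations:
    def __init__(self, limit):
        self.limit = limit
        self.active = 0   # running plus queued instances
        self.events = []

    def activate(self):
        if self.active >= self.limit:
            self.events.append("mtalimitexceeded")
            return False
        self.active += 1
        self.events.append("activate")
        return True

    def terminate(self):
        self.active -= 1
        self.events.append("terminate")

t2 = TaskActivations(limit=2)
results = [t2.activate() for _ in range(3)]  # third activation is rejected
# results -> [True, True, False]; t2.events[-1] -> "mtalimitexceeded"
```

Once an instance terminates, the counter drops and a subsequent activation
succeeds again, mirroring the behavior shown in the test sequence.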
\begin{figure}[]
\centering
\centerline{\includegraphics[width=\textwidth]{./media/eval/task_mta.png}}
\caption[Task-MTA test gantt chart]{Comparison of hardware (top) and
simulated (bottom) trace of the task-MTA test application.}
\label{fig:task_mta_gantt}
\end{figure}
\autoref{fig:task_mta_gantt} shows the comparison of the \gls{btf} traces
created by simulation and from hardware for the task-MTA test model. The
hardware trace clearly illustrates the procedure of an inter-core process
activation. At first the activation is triggered on \lstinline{Core_1} as
shown in the row \lstinline{IPA_T_1}. This results in the execution of the
inter-core communication \gls{isr} \lstinline{EE_TC_iirq_handler}.
The \gls{isr} then activates \lstinline{T_2} which changes into the ready state
indicated by the gray color. During the second activation \lstinline{T_2} is
already in the running state. Consequently, the activation is only illustrated
by a downward pointing arrow. In the simulated trace the task keeps running
during the activation process. In the hardware trace the task is preempted by
the inter-core \gls{isr} and the activation takes place while the task is in
the ready state.
During the third activation two instances of \lstinline{T_2} are already active
in the system. Thus, no further activations are allowed and a mtalimitexceeded
event is created. This is indicated by a downward pointing red arrow. At
around \unit[81925]{us} the first instance of \lstinline{T_2} terminates and
the next instance becomes ready immediately. Shortly afterwards this instance
starts running.
\subsection{Randomized Tests}
\label{subsection:randomized_tests}
Randomized tests are used to avoid insufficient test coverage due to selection
bias in the creation of the test applications. A tool for generating random
models automatically with respect to predefined constraints has been developed
in previous research projects \cite{sailer2014reconstruction}. It allows the
creation of an arbitrary number of test models based on user-defined
distributions, for example for the number of cores, tasks, and runnables.
\begin{table}[]
\centering
\begin{tabular}{c|c c c c}
Entities & min & max & average & distribution \\
\hline
Cores $[1]$ & 2 & - & - & const \\
Tasks $[1]$ & 9 & 22 & 15 & weibull \\
Runnables/Task $[1]$ & 6 & 13 & - & uniform \\
Instructions/Runnable $[10^3]$& 10 & 50 & 30 & weibull \\
Activation $[ms]$ & 1 & 20 & 1000 & weibull \\
Signals $[1]$ & 3 & 11 & 17 & weibull \\
Signals/Runnable $[1]$ & 3 & 7 & - & uniform \\
\end{tabular}
\caption[Randomized model configuration]{The configuration used for creating
test models randomly.}
\label{tab:rand_config}
\end{table}
\autoref{tab:rand_config} shows the distributions for the number of entities
that should be created for each entity type. This configuration is used for
each of the ten models that are tested in this section. The distributions for
\emph{cores} and \emph{tasks} represent the number of entities of the
respective type in the system. The metric \emph{runnables per task} determines
how many runnables are called from the context of each task. Each task is
activated by a periodic stimulus with a period depending on the
\emph{activation} value. \emph{Signals} specifies the number of signal
entities in the system and \emph{signals per runnable} the accesses to these
signals within the context of each runnable. Event and resource entities
cannot be generated by the random model generator and are therefore not covered
by randomized tests.
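The sampling implied by this configuration could look as follows. This is an
illustrative sketch only: the actual generator from the cited work may differ,
and the Weibull shape parameter $k$ is an assumption.

```python
# Sketch: drawing model parameters per the configuration table.
# Weibull samples are rescaled to the configured average and clamped to
# [min, max]; the shape parameter k is an assumed value.
import math
import random

def weibull_count(avg, lo, hi, k=1.5):
    # random.weibullvariate(scale, k) has mean scale * Gamma(1 + 1/k),
    # so rescale such that the sample mean matches the configured average.
    scale = avg / math.gamma(1.0 + 1.0 / k)
    return min(hi, max(lo, round(random.weibullvariate(scale, k))))

def random_model():
    return {
        "cores": 2,                                    # const
        "tasks": [
            {"runnables": random.randint(6, 13)}       # uniform per task
            for _ in range(weibull_count(15, 9, 22))   # weibull task count
        ],
    }

model = random_model()
# 9 <= len(model["tasks"]) <= 22, each task with 6..13 runnables
```

Repeating `random_model()` yields an arbitrary number of configurations that
all respect the bounds of the table above.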
Validating these models manually is not feasible. Therefore, only semantic
equality is tested because this can be done without user interaction. In
previous work a closed-loop model-based development process was created to
conduct the procedure shown in \autoref{fig:eval_idea} automatically
\cite{felixproject2}. This process was extended to support the model generator
and the semantic comparison of two traces.
\begin{figure}[]
\centering
\centerline{\includegraphics[width=0.55\textwidth]{./media/eval/semantic_impossible.pdf}}
\caption[Semantic comparison problem]{Semantic comparison of multi-core
systems is not feasible if the execution time of service routines varies
between hardware and simulation.}
\label{fig:task_runnable_signal}
\end{figure}
As mentioned before semantic equality could not be shown for any of the test
applications. The reason for this is depicted in
\autoref{fig:task_runnable_signal}. Assume that one task activates another
task on a different core and then executes multiple other actions. The
position at which the start event of the second task is inserted into the
trace depends on the time that elapses between activation and start. This
means two traces may be
semantically different even though they show the same behavior. Consequently,
the definition of semantic equality used in this thesis is not sufficient for
the comparison of multi-core systems. Nevertheless, by randomized comparison
of the traces the correctness of the mappings could be validated manually.
However, this fallback solution is not sufficient for validating a wide range
of test cases.
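The limitation described above can be reproduced with a minimal sketch of this
notion of semantic equality: the global event order is compared after
stripping timestamps, so a differing activation-to-start delay alone makes the
comparison fail. The (timestamp, target, event) tuples are illustrative.

```python
# Sketch: semantic equality as comparison of the global, timestamp-stripped
# event order. Tuple layout and event names are illustrative assumptions.
def semantically_equal(trace_a, trace_b):
    def strip(trace):
        return [(target, event) for _ts, target, event in trace]
    return strip(trace_a) == strip(trace_b)

# Same behavior on both cores, but the start of T_2 slips past the start of
# R_1 because the activation-to-start delay differs between hardware and
# simulation: the global-order check reports a deviation.
hw  = [(0, "T_2", "activate"), (10, "R_1", "start"), (12, "T_2", "start")]
sim = [(0, "T_2", "activate"), (8, "T_2", "start"), (10, "R_1", "start")]
# semantically_equal(hw, sim) -> False, despite identical per-core behavior
```

A per-entity comparison, which orders events separately for each task, would
tolerate such cross-core reorderings; this is one possible refinement of the
equality definition used here.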
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{./media/eval/eval_idea.pdf}
\caption[Mapping validation concept]{The general idea for the validation of
the software event to \gls{btf} event mapping. A model that represents a
certain system is created. Based on the model, a simulation, and a hardware
trace are generated. By comparing those traces errors in the transformation
process can be detected.}
\label{fig:eval_idea}
\end{figure}
In this chapter the software to system mappings are validated as depicted in
\autoref{fig:eval_idea}. A timing model of an application is created and a
\gls{btf} trace is generated from this model via discrete event simulation.
The simulated trace represents the expected result for the trace
recorded from hardware.
Next, C code is generated from the model. The code is compiled, executed on
hardware, and the runtime behavior is recorded via hardware tracing. The
resulting software level trace is transformed to system level according to
the respective mappings. The \gls{btf} trace recorded from hardware is then
compared to the simulated trace. Since both traces result from the same timing
model they are expected to represent the same system behavior.
Nevertheless, two kinds of deviations are expected. Firstly, timestamps of
otherwise identical events might differ. This is unavoidable because
simulation is an abstraction of reality and is not capable of taking all subtle
effects influencing the timing on real hardware into consideration. Secondly,
events may indicate a different software behavior. For example, a task starts
a runnable in one trace but not in the other. In this case, the deviation
must be examined because it might point to a mapping error.