scoutfs-utils: add TeX paper

Add the start of a paper that documents the scoutfs design.

Signed-off-by: Zach Brown <zab@versity.com>
Zach Brown
2018-06-25 16:18:09 -07:00
committed by Zach Brown
parent b96feaa5b0
commit 51a48fbbb6
5 changed files with 578 additions and 0 deletions

8
utils/tex/.gitignore vendored Normal file

@@ -0,0 +1,8 @@
missfont.log
*.fls
*.aux
*.d
*.d
*.fdb_latexmk
*.log
*.pdf

33
utils/tex/Makefile Normal file

@@ -0,0 +1,33 @@
#
# # dnf install latexmk texlive
# # make
#
# Tools
LATEXMK = latexmk
RM = rm -f
# Project-specific settings
DOCNAME = scoutfs
# Targets
all: doc
doc: pdf
pdf: $(DOCNAME).pdf
# Rules
%.pdf: %.tex
	$(LATEXMK) -pdf -M -MP -MF $*.d $*
mostlyclean:
	$(LATEXMK) -silent -c
	$(RM) *.bbl
clean: mostlyclean
	$(LATEXMK) -silent -C
	$(RM) *.run.xml *.synctex.gz
	$(RM) *.d
.PHONY: all clean doc mostlyclean pdf
# Include auto-generated dependencies
-include *.d

221
utils/tex/scoutfs.tex Normal file

@@ -0,0 +1,221 @@
% This was derived from the usenix templates, whose introductory
% comment is as follows:
%
% TEMPLATE for Usenix papers, specifically to meet requirements of
% USENIX '05
% originally a template for producing IEEE-format articles using LaTeX.
% written by Matthew Ward, CS Department, Worcester Polytechnic Institute.
% adapted by David Beazley for his excellent SWIG paper in Proceedings,
% Tcl 96
% turned into a smartass generic template by De Clarke, with thanks to
% both the above pioneers
% use at your own risk. Complaints to /dev/null.
% make it two column with no page numbering, default is 10 point
% Munged by Fred Douglis <douglis@research.att.com> 10/97 to separate
% the .sty file from the LaTeX source template, so that people can
% more easily include the .sty file into an existing document. Also
% changed to more closely follow the style guidelines as represented
% by the Word sample file.
% Note that since 2010, USENIX does not require endnotes. If you want
% foot of page notes, don't include the endnotes package in the
% usepackage command, below.
% This version uses the latex2e styles, not the very ancient 2.09 stuff.
\documentclass[letterpaper,twocolumn,10pt]{article}
\usepackage{usenix2019,epsfig}
\begin{document}
%don't want date printed
\date{}
%make title bold and 14 pt font (Latex default is non-bold, 16 pt)
\title{\Large \bf scoutfs : A Scalable Archival Filesystem}
%for single author (just remove % characters)
\author{
{\rm Zach Brown}\\
Versity Software, Inc.
}
\maketitle
% Use the following at camera-ready time to suppress page numbers.
% Comment it out when you first submit the paper for review.
% \thispagestyle{empty}
\section{Metadata Items}
scoutfs stores filesystem metadata in items that are identified by a
key and contain a variable length value payload.\\
Every key uses a generic structure with a fixed number of fields.
{\tt \small
\begin{verbatim}
struct scoutfs_key {
	__u8	sk_zone;
	__le64	_sk_first;
	__u8	sk_type;
	__le64	_sk_second;
	__le64	_sk_third;
	__u8	_sk_fourth;
};
\end{verbatim}
}
Using a shared key struct lets us sort all the metadata items in the
filesystem in one key space regardless of their form or function.
Generic code can display, sort, and perform arithmetic on keys
(incrementing a key or finding the difference between two keys) without
needing to know the specific fields of each item type.
Each item type is identified by its zone and type pair. It then maps
its own fields onto the remaining generic fields, which determines how
its item keys sort within the type.
For example, when storing inodes we use the {\tt SCOUTFS\_FS\_ZONE} and
{\tt SCOUTFS\_INODE\_TYPE} and put the inode number in the first generic
key field.
{\tt \small
\begin{verbatim}
#define ski_ino _sk_first
\end{verbatim}
}
{\tt \small
\begin{verbatim}
key.sk_zone = SCOUTFS_FS_ZONE;
key.ski_ino = ino;
key.sk_type = SCOUTFS_INODE_TYPE;
\end{verbatim}
}
Continuing this example, metadata that is associated with inodes also
uses the {\tt SCOUTFS\_FS\_ZONE} and stores the inode number in {\tt
\_sk\_first} but has a different type value, for example {\tt
SCOUTFS\_XATTR\_TYPE} or {\tt SCOUTFS\_SYMLINK\_TYPE}. When the items'
keys are sorted we end up with all the items for a given inode stored
near each other.
\subsection{Directory Entries}
A directory entry is stored in three different metadata items, each with
a different key and used for a different purpose. Each item shares the
same key format and directory entry value payload, however.
The key stores the entry's directory inode number and major and minor
values associated with the type of directory entry being stored.
{\tt \small
\begin{verbatim}
#define skd_ino _sk_first
#define skd_major _sk_second
#define skd_minor _sk_third
\end{verbatim}
}
The value contains a directory entry struct with all the metadata
associated with a directory entry, including the full entry name.
{\tt \small
\begin{verbatim}
struct scoutfs_dirent {
	__le64	ino;
	__le64	hash;
	__le64	pos;
	__u8	type;
	__u8	name[0];
};
\end{verbatim}
}
Each of the three items contains a full copy of the dirent value. This
duplicates storage across each item type but also lets each operation be
satisfied by one item lookup. Once the dirent value is obtained, its
fields can be used to construct the keys for each of the items
associated with the entry.
\subsubsection{Directory Entry Lookup Items}
{\tt \small
\begin{verbatim}
key.sk_zone = SCOUTFS_FS_ZONE;
key.skd_ino = dir_ino;
key.sk_type = SCOUTFS_DIRENT_TYPE;
key.skd_major = hash(entry_name);
key.skd_minor = dir_pos;
\end{verbatim}
}
Lookup entries are stored in the parent directory at the hash of the
name of the entry. These entries are used to map names to inode numbers
during path traversal.
The major key value is set to a 64-bit hash of the file name. These hash
values can collide, so the minor key value is set to the readdir position
in the directory of the entry. This readdir position is unique for
every entry and ensures that keys are unique when hash values collide.
A name lookup is performed by iterating over all the keys with the major
that matches the hashed name. The full name in the dirent value struct
is compared to the search name. It will be very rare to have more than
one item with a given hash value.
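The lookup described above can be sketched over an in-memory array that
stands in for the sorted lookup items of one directory. The {\tt
lookup\_item} struct, the {\tt lookup\_name()} function, the
NUL-terminated names, and the use of 0 as a ``not found'' inode number
are all assumptions made for illustration. The caller supplies the hash,
so no particular hash function is implied.
{\tt \small
\begin{verbatim}
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical in-memory stand-in for one lookup item: the two key
 * fields plus the parts of the dirent value that the search needs.
 */
struct lookup_item {
	uint64_t	major;	/* hash of the entry name */
	uint64_t	minor;	/* readdir position, unique per entry */
	uint64_t	ino;	/* inode number the entry refers to */
	const char	*name;	/* full entry name from the value */
};

/*
 * Search items sorted by (major, minor) for an entry called name.
 * Returns the entry's inode number, or 0 if nothing matches (assuming
 * 0 is never a valid inode number).
 */
static uint64_t lookup_name(const struct lookup_item *items, size_t nr,
			    uint64_t hash, const char *name)
{
	size_t lo = 0;
	size_t hi = nr;
	size_t i;

	assert(name != NULL);

	/* Binary search for the first item whose major is >= hash. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (items[mid].major < hash)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* Walk every item that shares the hash and compare full names. */
	for (i = lo; i < nr && items[i].major == hash; i++) {
		if (strcmp(items[i].name, name) == 0)
			return items[i].ino;
	}

	return 0;
}
```
\end{verbatim}
}
Because collisions are rare, the inner loop in this sketch would
usually visit a single item.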
\subsubsection{Directory Entry Readdir Items}
{\tt \small
\begin{verbatim}
key.sk_zone = SCOUTFS_FS_ZONE;
key.skd_ino = dir_ino;
key.sk_type = SCOUTFS_READDIR_TYPE;
key.skd_major = dir_pos;
key.skd_minor = 0;
\end{verbatim}
}
Readdir entries are used to iterate over entries for the readdir()
call. By providing a unique 64-bit {\tt dir\_pos} for each entry we avoid
having to track multiple entries for a given readdir position value.
readdir() returns entries in {\tt dir\_pos} order, which depends on entry
creation order and matches inode allocation order. Accessing the inodes
that are referenced by the entries returned from readdir() will result
in efficient forward iteration over the readdir and inode items,
assuming that the files were created in the directory and never renamed
or hard linked afterwards.
Renaming files or creating hard links to existing files creates a new
entry but can't reassign the inode number and can result in mismatched
access patterns of the readdir entry items and the inode items.
\subsubsection{Directory Entry Link Backref Items}
{\tt \small
\begin{verbatim}
key.sk_zone = SCOUTFS_FS_ZONE;
key.skd_ino = target_ino;
key.sk_type = SCOUTFS_LINK_BACKREF_TYPE;
key.skd_major = dir_ino;
key.skd_minor = dir_pos;
\end{verbatim}
}
Link backref entry items are stored with the target inode number and the
inode number and readdir position of the entry in its directory.
They're used to iterate over all the entries that refer to a given
inode. Full relative paths from the root directory to a target inode
can be constructed by walking up through each parent entry as it is
discovered.
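A minimal sketch of that upward walk follows. It assumes an in-memory
table of backrefs that already carries each entry's name, a root inode
number of 1, and that only the first backref of each inode is followed,
so an inode with several hard links yields just one of its paths. The
names {\tt backref}, {\tt first\_backref()}, {\tt build\_path()}, and
{\tt SKETCH\_ROOT\_INO} are hypothetical.
{\tt \small
\begin{verbatim}
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ROOT_INO 1ULL	/* assumed root inode number */

/* Hypothetical in-memory stand-in for one link backref item. */
struct backref {
	uint64_t	target_ino;	/* inode the entry refers to */
	uint64_t	dir_ino;	/* directory holding the entry */
	const char	*name;		/* entry name from the value */
};

/* Return the first backref that refers to ino, or NULL if none. */
static const struct backref *first_backref(const struct backref *refs,
					   size_t nr, uint64_t ino)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (refs[i].target_ino == ino)
			return &refs[i];
	}
	return NULL;
}

/*
 * Build the path from the root to ino in buf by walking up through the
 * parents and prepending each name.  Returns 0 on success, or -1 if a
 * backref is missing, the walk is too deep, or buf is too small.
 */
static int build_path(const struct backref *refs, size_t nr,
		      uint64_t ino, char *buf, size_t len)
{
	size_t pos = len;	/* the path is assembled right to left */
	size_t steps;

	assert(buf != NULL);
	if (len == 0)
		return -1;
	buf[--pos] = '\0';

	for (steps = 0; ino != SKETCH_ROOT_INO; steps++) {
		const struct backref *ref;
		size_t n;

		/* A path can't be deeper than the number of backrefs. */
		if (steps >= nr)
			return -1;
		ref = first_backref(refs, nr, ino);
		if (ref == NULL)
			return -1;

		/* Separate this name from the component after it. */
		if (buf[pos] != '\0') {
			if (pos < 1)
				return -1;
			buf[--pos] = '/';
		}
		n = strlen(ref->name);
		if (pos < n)
			return -1;
		pos -= n;
		memcpy(buf + pos, ref->name, n);

		ino = ref->dir_ino;
	}

	/* Shift the finished path to the start of the buffer. */
	memmove(buf, buf + pos, len - pos);
	return 0;
}
```
\end{verbatim}
}
Enumerating every path to a hard linked inode would instead visit each
backref of the target and of every ancestor, not only the first one.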
Both inode numbers and readdir positions are allocated by strictly
increasing the next free number. Old inode numbers or readdir positions
are never reused. This means that resolving paths for existing inodes
will always walk keys that are strictly sorted less than the keys that
will be created as new files are created. This tends to isolate read
access patterns during background archival policy processing from write
access patterns during new file creation and increases performance by
reducing contention.
\end{document}

97
utils/tex/usenix2019.sty Normal file

@@ -0,0 +1,97 @@
% usenix.sty - to be used with latex2e for USENIX.
% To use this style file, look at the template usenix_template.tex
%
% $Id: usenix.sty,v 1.2 2005/02/16 22:30:47 maniatis Exp $
%
% The following definitions are modifications of standard article.sty
% definitions, arranged to do a better job of matching the USENIX
% guidelines.
% It will automatically select two-column mode and the Times-Roman
% font.
%
% USENIX papers are two-column.
% Times-Roman font is nice if you can get it (requires NFSS,
% which is in latex2e).
\if@twocolumn\else\input twocolumn.sty\fi
\usepackage{mathptmx} % times roman, including math (where possible)
%
% USENIX wants margins of: 0.75" sides, 1" bottom, and 1" top.
% 0.33" gutter between columns.
% Gives active areas of 7" x 9"
%
\setlength{\textheight}{9.0in}
\setlength{\columnsep}{0.33in}
\setlength{\textwidth}{7.00in}
\setlength{\topmargin}{0.0in}
\setlength{\headheight}{0.0in}
\setlength{\headsep}{0.0in}
\addtolength{\oddsidemargin}{-0.25in}
\addtolength{\evensidemargin}{-0.25in}
% Usenix wants no page numbers for camera-ready papers, so that they can
% number them themselves. But submitted papers should have page numbers
% for the reviewers' convenience.
%
%
% \pagestyle{empty}
%
% Usenix titles are in 14-point bold type, with no date, and with no
% change in the empty page headers. The whole author section is 12 point
% italic--- you must use {\rm } around the actual author names to get
% them in roman.
%
\def\maketitle{\par
\begingroup
\renewcommand\thefootnote{\fnsymbol{footnote}}%
\def\@makefnmark{\hbox to\z@{$\m@th^{\@thefnmark}$\hss}}%
\long\def\@makefntext##1{\parindent 1em\noindent
\hbox to1.8em{\hss$\m@th^{\@thefnmark}$}##1}%
\if@twocolumn
\twocolumn[\@maketitle]%
\else \newpage
\global\@topnum\z@
\@maketitle \fi\@thanks
\endgroup
\setcounter{footnote}{0}%
\let\maketitle\relax
\let\@maketitle\relax
\gdef\@thanks{}\gdef\@author{}\gdef\@title{}\let\thanks\relax}
\def\@maketitle{\newpage
\vbox to 2.5in{
\vspace*{\fill}
\vskip 2em
\begin{center}%
{\Large\bf \@title \par}%
\vskip 0.375in minus 0.300in
{\large\it
\lineskip .5em
\begin{tabular}[t]{c}\@author
\end{tabular}\par}%
\end{center}%
\par
\vspace*{\fill}
% \vskip 1.5em
}
}
%
% The abstract is preceded by a 12-pt bold centered heading
\def\abstract{\begin{center}%
{\large\bf \abstractname\vspace{-.5em}\vspace{\z@}}%
\end{center}}
\def\endabstract{}
%
% Main section titles are 12-pt bold. Others can be same or smaller.
%
\def\section{\@startsection {section}{1}{\z@}{-3.5ex plus-1ex minus
-.2ex}{2.3ex plus.2ex}{\reset@font\large\bf}}

219
utils/tex/usenix2019.tex Normal file

@@ -0,0 +1,219 @@
% TEMPLATE for Usenix papers, specifically to meet requirements of
% USENIX '05
% originally a template for producing IEEE-format articles using LaTeX.
% written by Matthew Ward, CS Department, Worcester Polytechnic Institute.
% adapted by David Beazley for his excellent SWIG paper in Proceedings,
% Tcl 96
% turned into a smartass generic template by De Clarke, with thanks to
% both the above pioneers
% use at your own risk. Complaints to /dev/null.
% make it two column with no page numbering, default is 10 point
% Munged by Fred Douglis <douglis@research.att.com> 10/97 to separate
% the .sty file from the LaTeX source template, so that people can
% more easily include the .sty file into an existing document. Also
% changed to more closely follow the style guidelines as represented
% by the Word sample file.
% Note that since 2010, USENIX does not require endnotes. If you want
% foot of page notes, don't include the endnotes package in the
% usepackage command, below.
% This version uses the latex2e styles, not the very ancient 2.09 stuff.
\documentclass[letterpaper,twocolumn,10pt]{article}
\usepackage{usenix2019,epsfig,endnotes}
\begin{document}
%don't want date printed
\date{}
%make title bold and 14 pt font (Latex default is non-bold, 16 pt)
\title{\Large \bf Wonderful : A Terrific Application and Fascinating Paper}
%for single author (just remove % characters)
\author{
{\rm Your N.\ Here}\\
Your Institution
\and
{\rm Second Name}\\
Second Institution
% copy the following lines to add more authors
% \and
% {\rm Name}\\
%Name Institution
} % end author
\maketitle
% Use the following at camera-ready time to suppress page numbers.
% Comment it out when you first submit the paper for review.
\thispagestyle{empty}
\subsection*{Abstract}
Your Abstract Text Goes Here. Just a few facts.
Whet our appetites.
\section{Introduction}
A paragraph of text goes here. Lots of text. Plenty of interesting
text. \\
More fascinating text. Features\endnote{Remember to use endnotes, not footnotes!} galore, plethora of promises.\\
\section{This is Another Section}
Some embedded literal typeset code might
look like the following :
{\tt \small
\begin{verbatim}
int wrap_fact(ClientData clientData,
Tcl_Interp *interp,
int argc, char *argv[]) {
int result;
int arg0;
if (argc != 2) {
interp->result = "wrong # args";
return TCL_ERROR;
}
arg0 = atoi(argv[1]);
result = fact(arg0);
sprintf(interp->result,"%d",result);
return TCL_OK;
}
\end{verbatim}
}
Now we're going to cite somebody. Watch for the cite tag.
Here it comes~\cite{Chaum1981,Diffie1976}. The tilde character (\~{})
in the source means a non-breaking space. This way, your reference will
always be attached to the word that preceded it, instead of going to the
next line.
\section{This Section has SubSections}
\subsection{First SubSection}
Here's a typical figure reference. The figure is centered at the
top of the column. It's scaled. It's explicitly placed. You'll
have to tweak the numbers to get what you want.\\
% you can also use the wonderful epsfig package...
\begin{figure}[t]
\begin{center}
\begin{picture}(300,150)(0,200)
\put(-15,-30){\special{psfile = fig1.ps hscale = 50 vscale = 50}}
\end{picture}\\
\end{center}
\caption{Wonderful Flowchart}
\end{figure}
This text came after the figure, so we'll casually refer to Figure 1
as we go on our merry way.
\subsection{New Subsection}
It can get tricky typesetting Tcl and C code in LaTeX because they share
a lot of mystical feelings about certain magic characters. You
will have to do a lot of escaping to typeset curly braces and percent
signs, for example, like this:
``The {\tt \%module} directive
sets the name of the initialization function. This is optional, but is
recommended if building a Tcl 7.5 module.
Everything inside the {\tt \%\{, \%\}}
block is copied directly into the output, allowing the inclusion of
header files and additional C code.'' \\
Sometimes you want to really call attention to a piece of text. You
can center it in the column like this:
\begin{center}
{\tt \_1008e614\_Vector\_p}
\end{center}
and people will really notice it.\\
\noindent
The noindent at the start of this paragraph makes it clear that it's
a continuation of the preceding text, not a new para in its own right.
Now this is an ingenious way to get a forced space.
{\tt Real~$*$} and {\tt double~$*$} are equivalent.
Now here is another way to call attention to a line of code, but instead
of centering it, we noindent and bold it.\\
\noindent
{\bf \tt size\_t : fread ptr size nobj stream } \\
And here we have made an indented para like a definition tag (dt)
in HTML. You don't need a surrounding list macro pair.
\begin{itemize}
\item[] {\tt fread} reads from {\tt stream} into the array {\tt ptr} at
most {\tt nobj} objects of size {\tt size}. {\tt fread} returns
the number of objects read.
\end{itemize}
This concludes the definitions tag.
\subsection{How to Build Your Paper}
You have to run {\tt latex} once to prepare your references for
munging. Then run {\tt bibtex} to build your bibliography metadata.
Then run {\tt latex} twice to ensure all references have been resolved.
If your source file is called {\tt usenixTemplate.tex} and your {\tt
bibtex} file is called {\tt usenixTemplate.bib}, here's what you do:
{\tt \small
\begin{verbatim}
latex usenixTemplate
bibtex usenixTemplate
latex usenixTemplate
latex usenixTemplate
\end{verbatim}
}
\subsection{Last SubSection}
Well, it's getting boring isn't it. This is the last subsection
before we wrap it up.
\section{Acknowledgments}
A polite author always includes acknowledgments. Thank everyone,
especially those who funded the work.
\section{Availability}
It's great when this section says that MyWonderfulApp is free software,
available via anonymous FTP from
\begin{center}
{\tt ftp.site.dom/pub/myname/Wonderful}\\
\end{center}
Also, it's even greater when you can write that information is also
available on the Wonderful homepage at
\begin{center}
{\tt http://www.site.dom/\~{}myname/SWIG}
\end{center}
Now we get serious and fill in those references. Remember you will
have to run latex twice on the document in order to resolve those
cite tags you met earlier. This is where they get resolved.
We've preserved some real ones in addition to the template-speak.
After the bibliography you are DONE.
{\footnotesize \bibliographystyle{acm}
\bibliography{../common/bibliography}}
\theendnotes
\end{document}