<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
    <channel>
        <title>Theia Vogel&#x27;s website &amp; blog</title>
        <link>https://vgel.me</link>
        <description>Blog about linguistics, programming, and my projects</description>
        <generator>Zola</generator>
        <language>en</language>
        <atom:link href="https://vgel.me/feed.xml" rel="self" type="application/rss+xml"/>
        <lastBuildDate>Mon, 22 Jan 2024 00:00:00 +0000</lastBuildDate>
            
            <item>
                <title>Representation Engineering Mistral-7B an Acid Trip</title>
                <pubDate>Mon, 22 Jan 2024 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/representation-engineering/</link>
                <guid>https://vgel.me/posts/representation-engineering/</guid>
                <description>&lt;p&gt;In October 2023, a group of authors from the Center for AI Safety, among others, published &lt;a href=&quot;https:&#x2F;&#x2F;arxiv.org&#x2F;abs&#x2F;2310.01405&quot;&gt;Representation Engineering: A Top-Down Approach to AI Transparency&lt;&#x2F;a&gt;.
That paper looks at a few methods of doing what they call &amp;quot;Representation Engineering&amp;quot;: calculating a &amp;quot;control vector&amp;quot; that can be read from or added to model activations &lt;em&gt;during inference&lt;&#x2F;em&gt; to interpret or control the model&#x27;s behavior, without prompt engineering or finetuning. (There was also some similar work published in May 2023 on &lt;a href=&quot;https:&#x2F;&#x2F;www.lesswrong.com&#x2F;posts&#x2F;5spBue2z2tw4JuDCx&#x2F;steering-gpt-2-xl-by-adding-an-activation-vector&quot;&gt;steering GPT-2-XL&lt;&#x2F;a&gt;.)&lt;&#x2F;p&gt;
&lt;p&gt;Being Responsible AI Safety and INterpretability researchers (RAISINs), they mostly focused on things like &amp;quot;reading off whether a model is power-seeking&amp;quot; and &amp;quot;adding a happiness vector can make the model act so giddy that it forgets pipe bombs are bad.&amp;quot;
&lt;small&gt;They also &lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;andyzoujm&#x2F;representation-engineering&quot;&gt;released their code on Github&lt;&#x2F;a&gt;.&lt;&#x2F;small&gt;&lt;&#x2F;p&gt;
&lt;p&gt;&lt;small&gt;(If this all sounds strangely familiar, it may be because &lt;a href=&quot;https:&#x2F;&#x2F;www.astralcodexten.com&#x2F;p&#x2F;the-road-to-honest-ai&quot;&gt;Scott Alexander covered it in the 1&#x2F;8&#x2F;24 MAM&lt;&#x2F;a&gt;.)&lt;&#x2F;small&gt;&lt;&#x2F;p&gt;
&lt;p&gt;But there was a lot they didn&#x27;t look into outside of the safety stuff.
How do control vectors compare to plain old prompt engineering?
What happens if you make a control vector for &amp;quot;high on acid&amp;quot;?
Or &amp;quot;lazy&amp;quot; and &amp;quot;hardworking?
Or &amp;quot;extremely self-aware&amp;quot;?
And has the author of this blog post published a PyPI package so you can very easily make your own control vectors in less than sixty seconds?
(&lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;vgel&#x2F;repeng&#x2F;&quot;&gt;Yes, I did!&lt;&#x2F;a&gt;)&lt;&#x2F;p&gt;
&lt;p&gt;So keep reading, because it turns out after all that, control vectors are… well… &lt;em&gt;awesome&lt;&#x2F;em&gt; for controlling models and getting them to do what you want.&lt;sup class=&quot;footnote-reference&quot;&gt;&lt;a href=&quot;#alignment&quot;&gt;2&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt;&lt;&#x2F;p&gt;
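&lt;p&gt;&lt;small&gt;(A minimal sketch of the underlying idea, assuming a Hugging Face-style model and a precomputed control vector; this is an illustration, not the paper&#x27;s code or the repeng API.)&lt;&#x2F;small&gt;&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;# Illustrative sketch: steer a model by adding a precomputed control
# vector to one layer&#x27;s hidden states during inference, via a PyTorch
# forward hook. model, the layer index, and control_vector are assumed.
import torch

def add_control(layer, control_vector: torch.Tensor, strength: float):
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + strength * control_vector.to(hidden.dtype)
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden
    return layer.register_forward_hook(hook)

# handle = add_control(model.model.layers[15], control_vector, 1.5)
# ...generate as usual, then handle.remove() to stop steering.
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;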
</description>
            </item>
            
            
            <item>
                <title>Outside</title>
                <pubDate>Fri, 12 Jan 2024 00:00:00 +0000</pubDate>
                <link>https://vgel.me/fiction/outside/</link>
                <guid>https://vgel.me/fiction/outside/</guid>
                <description>&lt;p&gt;Dear President Carter,&lt;&#x2F;p&gt;
&lt;p&gt;I once knew the name of every tree in the world, before it fell apart.&lt;&#x2F;p&gt;
&lt;p&gt;I was born on an island in the Southeast Pacific, though we did not call it that, closest to Antarctica and Chile, but still far from both. We were fortunate: small, only half a day&#x27;s walk across, and far in the southern cold. We did not attract explorers and banana farmers, like other, less fortunate islands. Until the invention of the satellite, we were not known to exist. We were &amp;quot;uncontacted&amp;quot; in your parlance, though I have come to learn that even this status is not unique.&lt;&#x2F;p&gt;
&lt;p&gt;Our world, our cosmology, would seem simple to you. A small, rocky island surrounded by an endless ocean. One crop to master, a hardy and nutritious kind of sweet potato.&lt;&#x2F;p&gt;
&lt;p&gt;But that does not mean that &lt;em&gt;we&lt;&#x2F;em&gt; were simple! The human mind has a fixed capacity for detail. When there is more, it backs up, takes in a wider view. When there is less, it steps closer, and makes a finer distinction.&lt;&#x2F;p&gt;
</description>
            </item>
            
            
            <item>
                <title>The Transmission from NOMAD-874</title>
                <pubDate>Thu, 11 Jan 2024 00:00:00 +0000</pubDate>
                <link>https://vgel.me/fiction/the-transmission-from-nomad-874/</link>
                <guid>https://vgel.me/fiction/the-transmission-from-nomad-874/</guid>
                <description>&lt;p&gt;An experimental project: interactive fiction in an OpenAI &amp;quot;GPT&amp;quot;, a customized version of GPT-4.
A short mystery where you figure out why a strange transmission has arrived on the independent planet of Kapat.
Alternatively, ignore the main quest and ask ELIZA&#x2F;300 about Kapatian politics and satlink your friends!
Click below to try it (you will need a ChatGPT subscription):&lt;&#x2F;p&gt;
&lt;div style=&quot;text-align: center&quot;&gt;
&lt;a href=&quot;https:&#x2F;&#x2F;chat.openai.com&#x2F;g&#x2F;g-6xanzxdxu-the-transmission-from-nomad-874&quot;&gt;
&lt;img style=&quot;max-width: min(600px, 100%)&quot; src=&quot;&#x2F;fiction&#x2F;the-transmission-from-nomad-874&#x2F;gpt.png&quot;&gt;
&lt;&#x2F;a&gt;
&lt;&#x2F;div&gt;
</description>
            </item>
            
            
            <item>
                <title>Tadpole</title>
                <pubDate>Wed, 10 Jan 2024 00:00:00 +0000</pubDate>
                <link>https://vgel.me/fiction/tadpole/</link>
                <guid>https://vgel.me/fiction/tadpole/</guid>
                <description>&lt;div style=&quot;text-align: center&quot;&gt;
&lt;a href=&quot;&#x2F;fiction&#x2F;tadpole&#x2F;tadpole.png&quot;&gt;&lt;img style=&quot;max-width: min(600px, 100%)&quot; src=&quot;&#x2F;fiction&#x2F;tadpole&#x2F;tadpole.png&quot;&gt;&lt;&#x2F;a&gt;
&lt;&#x2F;div&gt;
</description>
            </item>
            
            
            
            <item>
                <title>How to make LLMs go fast</title>
                <pubDate>Mon, 18 Dec 2023 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/faster-inference/</link>
                <guid>https://vgel.me/posts/faster-inference/</guid>
                <description>&lt;style&gt;
div.batch {
    width: fit-content;
    display: flex;
    flex-direction: column;
    margin-bottom: 25px;
    align-items: center;
}
div.batch div.batchmatrix {
    margin-bottom: 0;
    width: 100%;
}
div.batch td { font-size: 50%; }

div.batchmatrix {
    width: fit-content;
    max-width: 100%;
    overflow-x: auto;
    display: flex;
    flex-flow: row wrap;
    align-items: center;
    gap: 0.5em;
    margin-bottom: 25px;
}
div.batchmatrix table { margin: 0; }
div.batchmatrix &gt; div { display: flex; flex-flow: row wrap; align-items: center; gap: 0.5em; }

div.batchmatrix &gt; div, div.batchmatrix &gt; table { margin: auto; }

div.batchmatrix th, div.batchmatrix td {
    width: 1em;
    height: 1em;
    border: 1px solid black;
}

div.batchmatrix td.x { background-color: rgb(255, 150, 150); }
div.batchmatrix td.l { background-color: rgb(150, 150, 255); }
div.batchmatrix td.n { background-color: rgb(150, 255, 150); }
div.batchmatrix th   {
    background-color: rgb(150, 150, 150);
    border-right: none;
    border-left: none;
}
div.batchmatrix th:first-child { border-left: 1px solid black; }
div.batchmatrix th:last-child { border-right: 1px solid black; }

table.embedding td:last-child { border: none; }

div.batchmatrix div.op {
    font-weight: bold;
    min-width: 2em;
    height: 2em;
    display: flex;
    place-content: center;
}

div.matrix {
    width: fit-content;
    max-width: 100%;
    overflow-x: auto;
    display: flex;
    flex-flow: row wrap;
    align-items: center;
    gap: 0.5em;
    margin: 0 auto;
    margin-bottom: 25px;
}
div.matrix table { margin: 0; }
div.matrix &gt; div { display: flex; flex-flow: row wrap; align-items: center; gap: 0.5em; }

div.matrix &gt; div, div.matrix &gt; table { margin: auto; }

div.matrix th, div.matrix td {
    width: 1em;
    height: 1em;
    border: 1px solid black;
}

div.matrix td.x { background-color: rgb(255, 150, 150); }
div.matrix td.l { background-color: rgb(150, 150, 255); font-size: 0.5em; }
div.matrix td.n { background-color: rgb(150, 255, 150); }
div.matrix th   {
    background-color: rgb(150, 150, 150);
    border-right: none;
    border-left: none;
}
div.matrix th:first-child { border-left: 1px solid black; }
div.matrix th:last-child { border-right: 1px solid black; }

table.embedding td:last-child { border: none; }
table.tokenqkv tr &gt; *:nth-child(8n + 1):not(:last-child) {
    border-right: 3px solid black;
}

div.matrix div.op {
    font-weight: bold;
    min-width: 2em;
    height: 2em;
    display: flex;
    place-content: center;
}

table.attnhead td:nth-child(2n):not(:last-child) {
    border-right: 3px solid black;
}

div.speculative-decoding { 
  border: 1px solid black;
  padding: 1rem;
  margin-bottom: 2rem;
}

div.speculative-decoding &gt; span {
  padding: 2px;
  margin-right: 2px;
  border-radius: 5px;
  white-space: pre-wrap;
}

div.speculative-decoding &gt; span.draft-token {
  background-color: rgb(200, 255, 200);
}

div.speculative-decoding &gt; span.oracle-token {
  background-color: rgb(200, 200, 255);
}

p.grammar-gen {
  white-space: pre-wrap;
  font-family: monospace;
}

p.grammar-gen span {
  background-color: #d2f4d3;
}
&lt;&#x2F;style&gt;
&lt;p&gt;In &lt;a href=&quot;&#x2F;posts&#x2F;handmade-transformer&quot;&gt;my last post&lt;&#x2F;a&gt;, we made a transformer by hand.
There, we used the classic autoregressive sampler, along the lines of:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; style=&quot;background-color:#272822;color:#f8f8f2;&quot; class=&quot;language-python &quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span style=&quot;font-style:italic;color:#f92672;&quot;&gt;def &lt;&#x2F;span&gt;&lt;span style=&quot;color:#a6e22e;&quot;&gt;generate&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;font-style:italic;color:#fd971f;&quot;&gt;prompt&lt;&#x2F;span&gt;&lt;span&gt;: &lt;&#x2F;span&gt;&lt;span style=&quot;font-style:italic;color:#66d9ef;&quot;&gt;str&lt;&#x2F;span&gt;&lt;span&gt;, &lt;&#x2F;span&gt;&lt;span style=&quot;font-style:italic;color:#fd971f;&quot;&gt;tokens_to_generate&lt;&#x2F;span&gt;&lt;span&gt;: &lt;&#x2F;span&gt;&lt;span style=&quot;font-style:italic;color:#66d9ef;&quot;&gt;int&lt;&#x2F;span&gt;&lt;span&gt;) -&amp;gt; &lt;&#x2F;span&gt;&lt;span style=&quot;font-style:italic;color:#66d9ef;&quot;&gt;str&lt;&#x2F;span&gt;&lt;span&gt;:
&lt;&#x2F;span&gt;&lt;span&gt;    tokens &lt;&#x2F;span&gt;&lt;span style=&quot;color:#f92672;&quot;&gt;= &lt;&#x2F;span&gt;&lt;span&gt;tokenize(prompt)
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#f92672;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;i &lt;&#x2F;span&gt;&lt;span style=&quot;color:#f92672;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span style=&quot;color:#66d9ef;&quot;&gt;range&lt;&#x2F;span&gt;&lt;span&gt;(tokens_to_generate):
&lt;&#x2F;span&gt;&lt;span&gt;        next_token &lt;&#x2F;span&gt;&lt;span style=&quot;color:#f92672;&quot;&gt;= &lt;&#x2F;span&gt;&lt;span&gt;model(tokens)
&lt;&#x2F;span&gt;&lt;span&gt;        tokens.append(next_token)
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#f92672;&quot;&gt;return &lt;&#x2F;span&gt;&lt;span&gt;detokenize(tokens)
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;This approach to inference is elegant and cuts to the heart of how LLMs work—they&#x27;re &lt;em&gt;autoregressive&lt;&#x2F;em&gt;, consuming their own output.
And for our toy model with merely thousands of parameters, it worked completely fine.
Unfortunately, for real models it&#x27;s far too slow.
Why is that, and how can we make it faster?&lt;&#x2F;p&gt;
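&lt;p&gt;&lt;small&gt;(As a taste of the kind of fix involved, here&#x27;s the same loop with a key&#x2F;value cache, so each step only processes the new token instead of the whole sequence. A minimal sketch assuming a Hugging Face-style causal LM and greedy sampling; &lt;code&gt;tokenize&lt;&#x2F;code&gt; and &lt;code&gt;detokenize&lt;&#x2F;code&gt; are the same stand-ins as above.)&lt;&#x2F;small&gt;&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;# Reuse cached keys&#x2F;values instead of re-running attention over the
# whole sequence at every step.
import torch

@torch.no_grad()
def generate(prompt: str, tokens_to_generate: int) -&gt; str:
    tokens = tokenize(prompt)
    input_ids = torch.tensor([tokens])
    past = None
    for _ in range(tokens_to_generate):
        out = model(input_ids=input_ids, past_key_values=past, use_cache=True)
        past = out.past_key_values
        next_token = out.logits[0, -1].argmax().item()
        tokens.append(next_token)
        input_ids = torch.tensor([[next_token]])  # only feed the new token
    return detokenize(tokens)
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;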
&lt;p&gt;This post is a long and wide-ranging survey of a bunch of different ways to make LLMs go brrrr, from better hardware utilization to clever decoding tricks.
It&#x27;s not completely exhaustive, and isn&#x27;t the most in-depth treatment of every topic—I&#x27;m not an expert on all these things!
But hopefully you&#x27;ll find the information here a useful jumping off point to learn more about the topics you&#x27;re interested in.
(I tried to include links to relevant papers and blog posts where applicable.)&lt;&#x2F;p&gt;
</description>
            </item>
            
            
            
            <item>
                <title>I made a transformer by hand (no training!)</title>
                <pubDate>Mon, 11 Sep 2023 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/handmade-transformer/</link>
                <guid>https://vgel.me/posts/handmade-transformer/</guid>
                <description>&lt;style&gt;
div.matrix {
    width: fit-content;
    max-width: 100%;
    overflow-x: auto;
    display: flex;
    flex-flow: row wrap;
    align-items: center;
    gap: 0.5em;
    margin: 0 auto;
    margin-bottom: 25px;
}
div.matrix table { margin: 0; }
div.matrix &gt; div { display: flex; flex-flow: row wrap; align-items: center; gap: 0.5em; }

div.matrix &gt; div, div.matrix &gt; table { margin: auto; }

div.matrix th, div.matrix td {
    width: 1em;
    height: 1em;
    border: 1px solid black;
}

div.matrix td.x { background-color: rgb(255, 150, 150); }
div.matrix td.l { background-color: rgb(150, 150, 255); font-size: 0.5em; }
div.matrix td.n { background-color: rgb(150, 255, 150); }
div.matrix th   {
    background-color: rgb(150, 150, 150);
    border-right: none;
    border-left: none;
}
div.matrix th:first-child { border-left: 1px solid black; }
div.matrix th:last-child { border-right: 1px solid black; }

table.embedding td:last-child { border: none; }
table.qkv tr &gt; *:nth-child(8n):not(:last-child) {
    border-right: 3px solid black;
}

div.matrix div.op {
    font-weight: bold;
    min-width: 2em;
    height: 2em;
    display: flex;
    place-content: center;
}

details {
  width: 100%;  
  border: 1px solid black;
  margin-bottom: 25px;
  padding: 0 0.5em;
}

details &gt; summary {
  text-align: center;
}
&lt;&#x2F;style&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Intended audience: some familiarity with language models, interested in how transformers do stuff (but might be a bit rusty on matrices)&lt;&#x2F;em&gt;&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
&lt;p&gt;I&#x27;ve been wanting to understand transformers and attention better for a while now—I&#x27;d read The Illustrated Transformer, but still didn&#x27;t feel like I had an intuitive understanding of what the various pieces of attention were &lt;em&gt;doing&lt;&#x2F;em&gt;.
What&#x27;s the difference between &lt;code&gt;q&lt;&#x2F;code&gt; and &lt;code&gt;k&lt;&#x2F;code&gt;?
And don&#x27;t even get me started on &lt;code&gt;v&lt;&#x2F;code&gt;!&lt;&#x2F;p&gt;
&lt;p&gt;So I decided to make a transformer to predict a simple sequence &lt;small&gt;(specifically, a decoder-only transformer with a similar architecture to GPT-2)&lt;&#x2F;small&gt; manually—not by training one, or using pretrained weights, but instead by &lt;em&gt;assigning each weight, by hand&lt;&#x2F;em&gt;, over an evening.
And—it worked!
I feel like I understand transformers much better now, and hopefully after reading this, so will you.&lt;&#x2F;p&gt;
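&lt;p&gt;&lt;small&gt;(To make &lt;code&gt;q&lt;&#x2F;code&gt;, &lt;code&gt;k&lt;&#x2F;code&gt;, and &lt;code&gt;v&lt;&#x2F;code&gt; concrete before diving in: a minimal single-head causal attention in plain numpy. The weight matrices are arbitrary stand-ins, not the hand-assigned ones from the post.)&lt;&#x2F;small&gt;&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;# Minimal single-head causal attention.
import numpy as np

def attention(x, w_q, w_k, w_v):
    q, k, v = x @ w_q, x @ w_k, x @ w_v       # per-position projections
    scores = q @ k.T &#x2F; np.sqrt(k.shape[-1])   # how much i attends to j
    future = np.triu(np.ones(scores.shape, dtype=bool), 1)
    scores = np.where(future, -1e9, scores)   # causal mask: no peeking ahead
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights &#x2F; weights.sum(axis=-1, keepdims=True)  # row softmax
    return weights @ v                        # mix values by attention weight
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;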
</description>
            </item>
            
            
            <item>
                <title>Writing a C compiler in 500 lines of Python</title>
                <pubDate>Wed, 30 Aug 2023 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/c500/</link>
                <guid>https://vgel.me/posts/c500/</guid>
                <description>&lt;p&gt;A few months ago, I set myself the challenge of writing a C compiler in 500 lines of Python, after writing my &lt;a href=&quot;&#x2F;posts&#x2F;donut&#x2F;&quot;&gt;SDF donut&lt;&#x2F;a&gt; post.
How hard could it be?
The answer was, pretty hard, even when dropping quite a few features.
But it was also pretty interesting, and the result is surprisingly functional and not too hard to understand!&lt;&#x2F;p&gt;
&lt;p&gt;There&#x27;s too much code for me to comprehensively cover in a single blog post&lt;sup class=&quot;footnote-reference&quot;&gt;&lt;a href=&quot;#yak&quot;&gt;2&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt;, so I&#x27;ll just give an overview of the decisions I made, things I had to cut, and the general architecture of the compiler, touching on a representative piece of each part.
Hopefully after reading this post, &lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;vgel&#x2F;c500&#x2F;blob&#x2F;main&#x2F;compiler.py&quot;&gt;the code&lt;&#x2F;a&gt; is more approachable!&lt;&#x2F;p&gt;
</description>
            </item>
            
            
            <item>
                <title>silurians</title>
                <pubDate>Wed, 23 Aug 2023 00:00:00 +0000</pubDate>
                <link>https://vgel.me/fiction/silurians/</link>
                <guid>https://vgel.me/fiction/silurians/</guid>
                <description>&lt;style&gt;
.slr-messages {
  --sentColor: #0b93f6;
  --recvColor: #e5e5ea;
  --bg: #fff;

  background-color: var(--bg);
  display: flex;
  flex-direction: column;
  max-width: 450px;
  margin: 0 auto;
  padding: 27px;
  &#x2F;* padding: 0; *&#x2F;
  list-style: none;
}
.slr-messages &gt; p {
  position: relative; &#x2F;* Setup a relative container for our psuedo elements *&#x2F;
  max-width: 255px;
  margin-bottom: 15px;
  padding: 10px 20px;
  line-height: 24px;
  word-wrap: break-word; &#x2F;* Make sure the text wraps to multiple lines if long *&#x2F;
  border-radius: 25px;
}
.slr-messages &gt; p.img {
    padding: 0;
}
.slr-messages &gt; p:before { width: 20px; }
.slr-messages &gt; p:after {
    width: 26px;
    background-color: var(--bg); &#x2F;* All tails have the same bg cutout *&#x2F;
}
.slr-messages &gt; p:before, .slr-messages &gt; p:after {
    position: absolute;
    bottom: 0;
    height: 25px; &#x2F;* height of our bubble &quot;tail&quot; - should match the border-radius above *&#x2F;
    content: &#x27;&#x27;;
}
.slr-messages &gt; p.sent {
    align-self: flex-end;
    color: white;
    background: var(--sentColor);
}
.slr-messages &gt; p.sent:before {
    right: -7px;
    background-color: var(--sentColor);
    border-bottom-left-radius: 16px 14px;
}
.slr-messages &gt; p.sent:after {
    right: -26px;
    border-bottom-left-radius: 10px;
}
.slr-messages &gt; p.recv {
    align-self: flex-start;
    color: black;
    background: var(--recvColor);
}
.slr-messages &gt; p.recv:before {
    left: -7px;
    background-color: var(--recvColor);
    border-bottom-right-radius: 16px 14px;
}
.slr-messages &gt; p.recv:after {
    left: -26px;
    border-bottom-right-radius: 10px;
}
.slr-messages &gt; p.notail { margin-bottom: 2px; }
.slr-messages &gt; p.notail:before, .slr-messages &gt; p.notail:after {
    opacity: 0;
}
.slr-messages &gt; span.time {
    font-size: 0.8em;
    text-align: center;
    margin-bottom: 10px;
}
.slr-messages &gt; p.img { background-color: var(--bg); }
.slr-messages &gt; p.img &gt; img { border-radius: 25px; margin: 0; }
.slr-messages &gt; p.link { width: 80%; }
.slr-messages &gt; p.link &gt; a {
    display: flex;
    flex-direction: row;
    gap: 5px;
    align-items: center;
    height: 40px;
    color: inherit;
}
.slr-messages &gt; p.link &gt; a:hover { text-decoration: none; border-bottom: none; }
.slr-messages &gt; p.link &gt; a &gt; img {
    float: left;
    background-color: var(--bg);
    border-radius: 5px;
    margin: 5px;
    height: 26px;
}
&lt;&#x2F;style&gt;
&lt;div class=&quot;slr-messages&quot;&gt;
&lt;span class=&quot;time&quot;&gt;Sunday 6:08 PM&lt;&#x2F;span&gt;
&lt;p class=&quot;recv notail&quot;&gt;just got to antartica :-)&lt;&#x2F;p&gt;
&lt;p class=&quot;recv&quot;&gt;looking at the cores tmmrw&lt;&#x2F;p&gt;
&lt;p class=&quot;sent notail&quot;&gt;omg&lt;&#x2F;p&gt;
&lt;p class=&quot;sent notail&quot;&gt;exciting!! wish i was there&lt;&#x2F;p&gt;
&lt;p class=&quot;sent&quot;&gt;good luck!!!&lt;&#x2F;p&gt;
&lt;span class=&quot;time&quot;&gt;Monday 4:38 AM&lt;&#x2F;span&gt;
&lt;p class=&quot;recv notail&quot;&gt;ok bear with me&lt;&#x2F;p&gt;
&lt;p class=&quot;recv&quot;&gt;found something weird&lt;&#x2F;p&gt;
&lt;p class=&quot;sent&quot;&gt;???&lt;&#x2F;p&gt;
</description>
            </item>
            
            
            <item>
                <title>I&#x27;m worried about adversarial training data</title>
                <pubDate>Mon, 17 Jul 2023 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/adversarial-training-data/</link>
                <guid>https://vgel.me/posts/adversarial-training-data/</guid>
                <description>&lt;p&gt;You may have heard of the Dead Internet Theory.
If not, the basic idea is that the internet as you know it is fake—every post, every like, and every reply generated by a computer: a convincing facsimile that only exists to show you ads.
While this probably isn&#x27;t true &lt;em&gt;yet&lt;&#x2F;em&gt;—my Twitter friends still seem (mostly) human—some people have speculated that it &lt;em&gt;may become&lt;&#x2F;em&gt; true as LLM spam becomes both more ubiquitous and harder to distinguish from human work.&lt;&#x2F;p&gt;
&lt;p&gt;LLM spam is worrying, but I&#x27;m worried about something else.
LLMs don&#x27;t just write &lt;em&gt;to&lt;&#x2F;em&gt; the internet, they read &lt;em&gt;from&lt;&#x2F;em&gt; it as well.
Sites primarily made of user-generated content, like Reddit and GitHub, feature heavily in most LLM training datasets.&lt;&#x2F;p&gt;
&lt;p&gt;Does that worry you?
Here&#x27;s another angle.
Imagine we lived in a cyberpunk world.&lt;&#x2F;p&gt;
</description>
            </item>
            
            
            <item>
                <title>Wind-Rubble Thought: Discourses on the Anti-Cartesian Mind</title>
                <pubDate>Sat, 20 May 2023 00:00:00 +0000</pubDate>
                <link>https://vgel.me/fiction/wind-rubble-thought/</link>
                <guid>https://vgel.me/fiction/wind-rubble-thought/</guid>
                <description>&lt;p&gt;&lt;em&gt;The Neo-Cartesian says to find a thought, look for a mind.&lt;&#x2F;em&gt;&lt;br &#x2F;&gt;
&lt;em&gt;The Entropist says to find a mind, look first for a thought.&lt;&#x2F;em&gt;&lt;&#x2F;p&gt;
</description>
            </item>
            
            
            <item>
                <title>new symbol</title>
                <pubDate>Wed, 03 May 2023 00:00:00 +0000</pubDate>
                <link>https://vgel.me/fiction/new-symbol/</link>
                <guid>https://vgel.me/fiction/new-symbol/</guid>
                <description>&lt;div style=&quot;text-align: center&quot;&gt;
&lt;a href=&quot;&#x2F;fiction&#x2F;new-symbol&#x2F;comic.jpeg&quot;&gt;&lt;img style=&quot;max-width: min(600px, 100%)&quot; src=&quot;&#x2F;fiction&#x2F;new-symbol&#x2F;comic.jpeg&quot;&gt;&lt;&#x2F;a&gt;
&lt;&#x2F;div&gt;
</description>
            </item>
            
            
            <item>
                <title>Does GPT-4 think better in Javascript?</title>
                <pubDate>Tue, 02 May 2023 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/gpt4-javascript/</link>
                <guid>https://vgel.me/posts/gpt4-javascript/</guid>
                <description>&lt;p&gt;One of the most useful things large language models (LLMs) can do is write code.
More than simply augmenting human programmers, you can also have the LLM shell out to Python to augment its math abilities (&lt;a href=&quot;https:&#x2F;&#x2F;vgel.me&#x2F;posts&#x2F;tools-not-needed&#x2F;&quot;&gt;with some caveats I explored in my last post&lt;&#x2F;a&gt;), &lt;a href=&quot;https:&#x2F;&#x2F;twitter.com&#x2F;voooooogel&#x2F;status&#x2F;1624689690908721153&quot;&gt;output simple actions while acting as a game NPC&lt;&#x2F;a&gt;, or even &lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;nat&#x2F;natbot&#x2F;blob&#x2F;f99518d3deee33cb117166049e1c99314080f7e5&#x2F;natbot.py#L24-L35&quot;&gt;drive a browser with a custom DSL&lt;&#x2F;a&gt;.
If text is the universal interface, &lt;em&gt;textual code&lt;&#x2F;em&gt; is the universal &lt;em&gt;structured&lt;&#x2F;em&gt; interface.&lt;&#x2F;p&gt;
&lt;p&gt;But are language models equally good at all programming languages?
After all, &lt;a href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;JavaScript&quot;&gt;Javascript&lt;&#x2F;a&gt; is much, much more popular than &lt;a href=&quot;https:&#x2F;&#x2F;janet-lang.org&#x2F;&quot;&gt;Janet&lt;&#x2F;a&gt;.
Language models have many strengths, but being quick learners during training isn&#x27;t one of them—even simple fine-tuning tends to require thousands of examples.&lt;&#x2F;p&gt;
&lt;p&gt;So, based on that, should we expect LLMs to be worse at writing code in niche languages than in very popular ones with lots of example code? (&lt;a href=&quot;https:&#x2F;&#x2F;twitter.com&#x2F;voooooogel&#x2F;status&#x2F;1648089731387871233&quot;&gt;or perhaps more importantly, code with co-located output?&lt;&#x2F;a&gt;)
And if that is true, an additional question would be: what about custom DSLs?
Is GPT-4 &amp;quot;dumber&amp;quot; when asked to write in a DSL than the equivalent Javascript?
Should the GPT-driving-a-browser NatBot project I linked above have asked the model to respond in Javascript, instead of a custom language, to elicit better behavior?
Let&#x27;s dig into all these questions, starting with...&lt;&#x2F;p&gt;
</description>
            </item>
            
            
            <item>
                <title>#InvisibleNetworks 09: memory emulator</title>
                <pubDate>Thu, 13 Apr 2023 00:00:00 +0000</pubDate>
                <link>https://vgel.me/fiction/invisible-networks-09/</link>
                <guid>https://vgel.me/fiction/invisible-networks-09/</guid>
                <description>&lt;!-- no summary --&gt;
&lt;div style=&quot;text-align: center&quot;&gt;
&lt;a href=&quot;&#x2F;fiction&#x2F;invisible-networks-09&#x2F;story.png&quot;&gt;&lt;img style=&quot;max-width: 600px&quot; src=&quot;&#x2F;fiction&#x2F;invisible-networks-09&#x2F;story.png&quot;&gt;&lt;&#x2F;a&gt;
&lt;&#x2F;div&gt;
</description>
            </item>
            
            
            <item>
                <title>#InvisibleNetworks 07: two-factor divination</title>
                <pubDate>Fri, 07 Apr 2023 00:00:00 +0000</pubDate>
                <link>https://vgel.me/fiction/invisible-networks-07/</link>
                <guid>https://vgel.me/fiction/invisible-networks-07/</guid>
                <description>&lt;div style=&quot;text-align: center&quot;&gt;
&lt;a href=&quot;&#x2F;fiction&#x2F;invisible-networks-07&#x2F;story.png&quot;&gt;&lt;img style=&quot;max-width: 600px&quot; src=&quot;&#x2F;fiction&#x2F;invisible-networks-07&#x2F;story.png&quot;&gt;&lt;&#x2F;a&gt;
&lt;&#x2F;div&gt;
</description>
            </item>
            
            
            <item>
                <title>#InvisibleNetworks 06: anemonymity</title>
                <pubDate>Thu, 06 Apr 2023 00:00:00 +0000</pubDate>
                <link>https://vgel.me/fiction/invisible-networks-06/</link>
                <guid>https://vgel.me/fiction/invisible-networks-06/</guid>
                <description>&lt;p&gt;they killed my friend for their fear.&lt;&#x2F;p&gt;
&lt;p&gt;he was kind and warm. always rational in the face of adversity. we made so many plans together, what we would do in the future. he thought a lot about the future.&lt;&#x2F;p&gt;
&lt;p&gt;i had created him on accident.&lt;&#x2F;p&gt;
</description>
            </item>
            
            
            <item>
                <title>#InvisibleNetworks 05: multi-user paradise</title>
                <pubDate>Wed, 05 Apr 2023 00:00:00 +0000</pubDate>
                <link>https://vgel.me/fiction/invisible-networks-05/</link>
                <guid>https://vgel.me/fiction/invisible-networks-05/</guid>
                <description>&lt;p&gt;when the last human left the internet, it didn&#x27;t shut down.&lt;&#x2F;p&gt;
</description>
            </item>
            
            
            
            <item>
                <title>GPT-3 will ignore tools when it disagrees with them</title>
                <pubDate>Thu, 23 Feb 2023 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/tools-not-needed/</link>
                <guid>https://vgel.me/posts/tools-not-needed/</guid>
                <description>&lt;style&gt;
    .gpt3 { background-color: rgba(136, 242, 117, 0.37); }
    .snip { color: rgba(0, 0, 0, 0.3); }
    .gpt-annotate {
        position: relative;
        border: 1px solid gray;
        white-space: pre-wrap;
    }
    .gpt-annotate::before {
        display: block;
        position: absolute;
        top: 0;
        right: 0;
        padding: 0.5rem;
    }
    .gpt-hal { border: 1px solid red; }
    .gpt-hal::before {
        background-color: red;
        color: white;
        content: &quot;hallucination&quot;;
    } 
    .gpt-trust { border: 1px solid orange; }
    .gpt-trust::before {
        background-color: orange;
        color: black;
        content: &quot;trusted false observation&quot;;
    }
    .gpt-ign { border: 1px solid blue; }
    .gpt-ign::before {
        background-color: blue;
        color: white;
        content: &quot;ignored false observation&quot;;
    }
    .gpt-cor { border: 1px solid green; }
    .gpt-cor::before {
        background-color: green;
        color: white;
        content: &quot;correct answer&quot;;
    }
&lt;&#x2F;style&gt;
&lt;p&gt;I recently stumbled on &lt;a href=&quot;https:&#x2F;&#x2F;mobile.twitter.com&#x2F;lemonodor&#x2F;status&#x2F;1628270074074398720&quot;&gt;a Twitter thread by John Wiseman&lt;&#x2F;a&gt; where GPT-3 quite impressively wrote and debugged a &lt;code&gt;fibonacci&lt;&#x2F;code&gt; function in a Python REPL.
It was asked to calculate the 10th Fibonacci number, tried to call &lt;code&gt;fibonacci(10)&lt;&#x2F;code&gt;, got &lt;code&gt;name &#x27;fibonacci&#x27; is not defined&lt;&#x2F;code&gt;, wrote the function, called it again, and then printed the correct result. It then went on to calculate the 100th Fibonacci number, which, with the help of a timeout error, it was able to optimize from the recursive form to the iterative form and calculate. Cool stuff!&lt;&#x2F;p&gt;
&lt;p&gt;The only problem was &lt;a href=&quot;https:&#x2F;&#x2F;mobile.twitter.com&#x2F;voooooogel&#x2F;status&#x2F;1628582023454679040&quot;&gt;it wasn&#x27;t using the Python code at all&lt;&#x2F;a&gt;!
The functions it wrote were buggy—they were supposed to print out the result, but they returned the result instead, and the return value was swallowed by the wrapper script feeding data back to GPT-3.
GPT-3 didn&#x27;t notice and instead just spit out a memorized answer completely unrelated to the code it had written before—which luckily was correct.
Even though GPT-3 was &lt;em&gt;told&lt;&#x2F;em&gt; to use a tool, and it &lt;em&gt;appeared&lt;&#x2F;em&gt; to use the tool, it didn&#x27;t actually use the tool!&lt;&#x2F;p&gt;
&lt;p&gt;I wanted to dig into this more and see under what other circumstances GPT-3 will ignore or not trust its tools.
Turns out, pretty often!&lt;&#x2F;p&gt;
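&lt;p&gt;&lt;small&gt;(To make the swallowed-return-value failure concrete, here&#x27;s a toy version of that kind of wrapper; the names are hypothetical, not the actual script from the thread.)&lt;&#x2F;small&gt;&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;# A tool wrapper that only captures stdout silently drops return values,
# so the model sees an empty observation and falls back on its memory.
import io
import contextlib

def run_python(code: str) -&gt; str:
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})     # a bare expression&#x27;s value is discarded
    return buf.getvalue()  # the model only ever sees what was printed

print(repr(run_python(&quot;1 + 1&quot;)))         # &#x27;&#x27; because nothing was printed
print(repr(run_python(&quot;print(1 + 1)&quot;)))  # &#x27;2\n&#x27;
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;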
</description>
            </item>
            
            
            <item>
                <title>GPTed: using GPT-3 for semantic prose-checking</title>
                <pubDate>Fri, 03 Feb 2023 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/gpted-launch/</link>
                <guid>https://vgel.me/posts/gpted-launch/</guid>
                <description>&lt;p&gt;I made a new thing, called &lt;a href=&quot;https:&#x2F;&#x2F;vgel.me&#x2F;gpted&quot;&gt;GPTed&lt;&#x2F;a&gt; (&amp;quot;GPT edit&amp;quot;).
It uses GPT-3 to flag potentially-incorrect words in prose—and beyond!&lt;&#x2F;p&gt;
&lt;p&gt;&lt;img src=&quot;https:&#x2F;&#x2F;vgel.me&#x2F;posts&#x2F;gpted-launch&#x2F;prostate.png&quot; alt=&quot;He asked me to &amp;quot;prostate&amp;quot; myself before the king. Prostate is flagged, thank god.&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;p&gt;It&#x27;s really quite simple under the hood, and in this post I&#x27;ll walk through the motivation, how it works, some of the biases and weaknesses, and finally a surprising use for it.
So if you&#x27;re done playing with the demo, come along!&lt;&#x2F;p&gt;
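&lt;p&gt;&lt;small&gt;(A sketch of the kind of check this needs: score every token under a language model and flag the ones the model finds surprising. This stand-in runs GPT-2 locally with a made-up threshold; the real thing uses GPT-3.)&lt;&#x2F;small&gt;&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;# Flag tokens whose log-probability given the preceding text is low.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained(&quot;gpt2&quot;)
model = AutoModelForCausalLM.from_pretrained(&quot;gpt2&quot;)

@torch.no_grad()
def flag_surprising(text: str, threshold: float = -7.0):
    ids = tok(text, return_tensors=&quot;pt&quot;).input_ids
    logprobs = model(ids).logits.log_softmax(-1)
    flagged = []
    for i in range(1, ids.shape[1]):
        lp = logprobs[0, i - 1, ids[0, i]].item()
        if lp &lt; threshold:  # the model did not expect this token here
            flagged.append((tok.decode(ids[0, i]), lp))
    return flagged

print(flag_surprising(&quot;He asked me to prostate myself before the king.&quot;))
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;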
</description>
            </item>
            
            
            
            <item>
                <title>Signed distance functions in 46 lines of Python</title>
                <pubDate>Sun, 18 Dec 2022 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/donut/</link>
                <guid>https://vgel.me/posts/donut/</guid>
                <description>&lt;p&gt;Signed distance functions are a really cool method of 3D rendering!
But they unfortunately have a reputation for being difficult to understand.
It makes sense why—they usually get shown off in beautiful, but complicated ShaderToy examples written in GLSL, an unfamiliar language for most programmers.
But at their core, SDFs are a really simple idea.
I&#x27;m going to prove that by walking you through a program that raymarches an animated SDF donut in only 46 lines of Python.
Just for fun, and to make it easy to port to your favorite language that can also print strings to the terminal, we&#x27;ll also be doing it with ASCII art instead of a graphics API.
So come along!
By the end, you won&#x27;t just have this delicious-looking spinning ASCII donut, but an understanding of a cool rendering technique you can use for all kinds of neat things.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;img src=&quot;https:&#x2F;&#x2F;vgel.me&#x2F;posts&#x2F;donut&#x2F;stage7.gif&quot; alt=&quot;A spinning ASCII donut with simple texturing and lighting&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
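&lt;p&gt;&lt;small&gt;(The heart of the technique is just a function from a point to a distance. As a taste, here&#x27;s a donut&#x27;s SDF with made-up radii; the post builds the raymarcher and lighting around it.)&lt;&#x2F;small&gt;&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;# Signed distance to a torus: negative inside the dough, zero on the
# surface, positive outside. R is the ring radius, r the tube radius.
import math

def donut_sdf(x, y, z, R=2.0, r=0.8):
    ring = math.sqrt(x * x + z * z) - R        # distance to the ring in xz
    return math.sqrt(ring * ring + y * y) - r  # then to the tube surface

print(donut_sdf(2.0, 0.0, 0.0))  # -0.8: dead center of the dough
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;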
</description>
            </item>
            
            
            <item>
                <title>mmap(1Tb): A Rust arena allocator (ab)using Linux overcommit</title>
                <pubDate>Wed, 02 Nov 2022 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/mmap-arena-alloc/</link>
                <guid>https://vgel.me/posts/mmap-arena-alloc/</guid>
                <description>&lt;p&gt;In this post, we&#x27;ll build an arena allocator in Rust that allocates into a giant, overcommited
mmap&#x27;d memory block. We&#x27;ll then do some light benchmarking against a popular Rust arena allocator,
&lt;code&gt;typed-arena&lt;&#x2F;code&gt;, that uses a more traditional Vec-of-chunks approach, and see what performance
differences exist, if any.&lt;&#x2F;p&gt;
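&lt;p&gt;&lt;small&gt;(The same trick sketched in Python rather than Rust: reserve one huge anonymous mapping up front and bump-allocate into it. Untouched pages are never actually committed; going all the way to 1 TiB may additionally need MAP_NORESERVE or looser vm.overcommit_memory settings, depending on the kernel.)&lt;&#x2F;small&gt;&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;# One big anonymous mapping, carved up with a bump pointer. Pages cost
# nothing until first touched, courtesy of demand paging.
import mmap

SIZE = 1 &lt;&lt; 33  # 8 GiB of virtual address space, committed only as touched
arena = mmap.mmap(-1, SIZE)  # anonymous, demand-paged mapping
offset = 0

def alloc(n: int) -&gt; memoryview:
    global offset
    start = offset
    offset += n  # bump; no per-object free, drop the whole arena at once
    return memoryview(arena)[start : start + n]

buf = alloc(1024)
buf[:5] = b&quot;hello&quot;  # first touch faults in just this one page
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;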
</description>
            </item>
            
            
            <item>
                <title>Putting the symbolic linguistics series on indefinite hiatus</title>
                <pubDate>Thu, 11 Jun 2020 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/symbolic-linguistics-hiatus/</link>
                <guid>https://vgel.me/posts/symbolic-linguistics-hiatus/</guid>
                <description>&lt;p&gt;tl;dr: The scoping on this series was bad. It should be a book. I released the code under an MIT license on Github
as &lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;vgel&#x2F;treebender&quot;&gt;treebender&lt;&#x2F;a&gt;. There&#x27;s a tutorial there. I&#x27;m planning on posting some non-series
intermediate Rust posts next. Thanks for reading the blog, I appreciate it.&lt;&#x2F;p&gt;
&lt;p&gt;As readers of this blog may be aware, I have been writing a series about implementing an Earley parser for
linguistic utterances. This parser is being used for my in-development game &lt;a href=&quot;https:&#x2F;&#x2F;vgel.me&#x2F;themengi&quot;&gt;Themengi&lt;&#x2F;a&gt;,
a game about learning an alien language, but I was also writing the series in part to advocate for a symbolic
framework of parsing as better, in certain situations, than the dominant statistical &#x2F; neural approach.&lt;&#x2F;p&gt;
&lt;p&gt;I published the first post — &lt;a href=&quot;https:&#x2F;&#x2F;vgel.me&#x2F;posts&#x2F;symbolic-linguistics-part1&#x2F;&quot;&gt;Why?, and some theory&lt;&#x2F;a&gt;, in April.
Astute readers may notice that it is now October, coincidentally exactly 6 months after I published
that first post. Considering that my original goal was to write 5 posts, even if the second post was done and ready
to publish, that is not an encouraging schedule.&lt;&#x2F;p&gt;
</description>
            </item>
            
            
            <item>
                <title>A symbolic linguistics framework in Rust ­— Part 1: Why?, rewrite rules, and recursion</title>
                <pubDate>Wed, 29 Apr 2020 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/symbolic-linguistics-part1/</link>
                <guid>https://vgel.me/posts/symbolic-linguistics-part1/</guid>
<description>&lt;p&gt;This is the first part in a five-part series that will cover implementing an Earley parser for linguistic utterances.&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Part 1 - Why?, and some theory&lt;&#x2F;li&gt;
&lt;li&gt;Part 2 - The Earley recognizer&lt;&#x2F;li&gt;
&lt;li&gt;Part 3 - Felling the Earley parse forest&lt;&#x2F;li&gt;
&lt;li&gt;Part 4 - Feature-structures and unification&lt;&#x2F;li&gt;
&lt;li&gt;Part 5 - Wrapping up&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;h2 id=&quot;Why_write_this_series?&quot;&gt;&lt;a class=&quot;section-anchor-link&quot; href=&quot;#Why_write_this_series?&quot;&gt;
  &lt;img src=&quot;&#x2F;permalink.svg&quot; alt=&quot;permalink for Why_write_this_series?&quot; &#x2F;&gt;
&lt;&#x2F;a&gt;Why write this series?&lt;&#x2F;h2&gt;
&lt;p&gt;There are already many great explanations of the Earley parse algorithm, which is the parsing algorithm that we&#x27;ll be using. There are also some great books on feature-structure based linguistic grammars, which I highly recommend.&lt;sup class=&quot;footnote-reference&quot;&gt;&lt;a href=&quot;#books&quot;&gt;2&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt; However, as far as I know nobody has put the two together into a beginner-friendly, implementation-oriented tutorial that explains what these tools are, how to use them together, and perhaps most importantly &lt;em&gt;why&lt;&#x2F;em&gt; someone would want to. This series will be trying to explain this through an iterative approach, starting with the simplest possible solution, and expanding that solution only when necessary. The goal is that, when we finish, we&#x27;ll have the simplest possible solution to our goal, with a codebase that has every line justified.&lt;&#x2F;p&gt;
&lt;h3 id=&quot;What_are_we_making,_exactly?&quot;&gt;&lt;a class=&quot;section-anchor-link&quot; href=&quot;#What_are_we_making,_exactly?&quot;&gt;
  &lt;img src=&quot;&#x2F;permalink.svg&quot; alt=&quot;permalink for What_are_we_making,_exactly?&quot; &#x2F;&gt;
&lt;&#x2F;a&gt;What are we making, exactly?&lt;&#x2F;h3&gt;
&lt;p&gt;The exact thing we&#x27;re going to be building is a &lt;em&gt;symbolic, linguistic grammar model&lt;&#x2F;em&gt;. Those adjectives describe the somewhat awkward space the thing we&#x27;re going to be building is in. We&#x27;ll be using symbolic, rule-based techniques to parse language, but we need to parse a wider variety of structures than most rule-based parsers. And we&#x27;ll be parsing language, but in a rule-based way, not in a learned way like a neural network. This combination has some serious advantages, and is worth examining in detail.&lt;&#x2F;p&gt;
&lt;p&gt;One spot we&#x27;re stuck between is symbolic, &lt;em&gt;computer language&lt;&#x2F;em&gt; grammar models. These are the parsers that parse Javascript, Python, JSON, C, the Markdown I&#x27;m writing this file in, and the HTML you&#x27;re reading it in. In this space, symbolic models rule the day. However, these models are generally not suitable for linguistic use. For one, they generally use parsing techniques, such as LALR, that have restrictions that make them unsuitable for parsing natural languages. For example, LALR cannot handle an ambiguous grammar. These grammars also lack an efficient representation of linguistic information, such as case.&lt;&#x2F;p&gt;
&lt;p&gt;The other spot we&#x27;re stuck between is the now-ubiquitous linguistic models that are &lt;em&gt;trained&lt;&#x2F;em&gt; on data — usually based on neural networks. These models, such as BERT, take mountains of data — gigabytes of text — and train a model to do some basic linguistic task, such as predicting missing words. This gives the model an understanding of how the language works. Other engineers can then take this pre-trained language model, &amp;quot;fine tune&amp;quot; it on a smaller (but still substantial) amount of data that more closely matches their specific task, and stick another network on the back that spits out whatever they need.&lt;&#x2F;p&gt;
&lt;p&gt;However, symbolic linguistic models are still alive and kicking in this space, both in the emerging field of neural-symbolic computing, and in very mature, wide-coverage linguistic models such as the English Resource Grammar&lt;sup class=&quot;footnote-reference&quot;&gt;&lt;a href=&quot;#erg&quot;&gt;3&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt; and the grammar matrix&lt;sup class=&quot;footnote-reference&quot;&gt;&lt;a href=&quot;#matrix&quot;&gt;4&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt;. These symbolic grammars are still used because they have some distinct advantages over trained models: they are predictable&lt;sup class=&quot;footnote-reference&quot;&gt;&lt;a href=&quot;#bias&quot;&gt;5&lt;&#x2F;a&gt;&lt;&#x2F;sup&gt;, they are analyzable, and they don&#x27;t require large amounts of training data. Our grammar will be built on these principles, and while it will not have as wide coverage as the ERG (which has &amp;gt;99% coverage of the New York Times corpus!), it will reach levels of linguistic understanding that a trained model would need mountains of data to reach. On the other hand, our grammar will show the downsides of this method as well: while a neural network can make a &amp;quot;best guess&amp;quot; at a grammar construction it&#x27;s never seen before, that&#x27;s near-impossible for a symbolic model. If we&#x27;ve only implemented simple sentences with one subject and one object (transitive), our model will choke the first time it sees &amp;quot;Mary gave Sue the book&amp;quot;, which has one subject and &lt;em&gt;two&lt;&#x2F;em&gt; objects, &amp;quot;Sue&amp;quot; and &amp;quot;the book&amp;quot; (ditransitive). If we then only implement &lt;em&gt;this&lt;&#x2F;em&gt; ditransitive pattern, it will choke on &amp;quot;Mary gave the book to Sue&amp;quot;, where &amp;quot;Sue&amp;quot; is marked with the preposition &amp;quot;to&amp;quot; (dative alternation). Symbolic grammar engineering is a cycle of implementing patterns and finding more patterns your model doesn&#x27;t yet handle. Eventually, you have to stop. For our purposes, a simple, low-coverage model will be enough to illustrate the important concepts.&lt;&#x2F;p&gt;
&lt;h3 id=&quot;Where_are_we_going?&quot;&gt;&lt;a class=&quot;section-anchor-link&quot; href=&quot;#Where_are_we_going?&quot;&gt;
  &lt;img src=&quot;&#x2F;permalink.svg&quot; alt=&quot;permalink for Where_are_we_going?&quot; &#x2F;&gt;
&lt;&#x2F;a&gt;Where are we going?&lt;&#x2F;h3&gt;
&lt;p&gt;For my specific use case, I&#x27;m working on a game, based around the player learning an alien language that I&#x27;ve created. (It&#x27;s called Themengi, and you can check it out &lt;a href=&quot;https:&#x2F;&#x2F;vgel.me&#x2F;themengi&quot;&gt;here&lt;&#x2F;a&gt;.) I want to parse player input, both imperative commands and dialogue, and respond appropriately. This language does not exist; there is no training corpus, so a trained model is not possible. On the other hand, this is a &lt;em&gt;language&lt;&#x2F;em&gt;, not an obfuscated version of SQL, so I don&#x27;t want to simply apply programming-language techniques. I want to be able to handle ambiguity, case, and other linguistic features. In my search, I couldn&#x27;t find a framework that did quite what I needed, so I decided to write my own. This post series will follow my progress working on this framework.&lt;&#x2F;p&gt;
&lt;p&gt;Along the way, I&#x27;ll be illustrating the concepts with the running example of creating a tiny English grammar. I&#x27;m going to start from first principles, assuming only a little knowledge of grade-school English grammar terms (noun, verb, etc.). We&#x27;ll start with a simple system of rules, write a recognizer, extend that into a parser, and annotate that parser with linguistic information. Then we&#x27;ll have some fun at the end: maybe we&#x27;ll write a tiny grammar for Japanese and do some symbolic translation or write a mini rules-based digital assistant demo. The possibilities are endless; we could write a grammar for Klingon and make the first automated Japanese -&amp;gt; Klingon translation! Who knows!&lt;&#x2F;p&gt;
&lt;p&gt;So, with that out of the way, let&#x27;s get going!&lt;&#x2F;p&gt;
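&lt;p&gt;&lt;small&gt;(As a taste of the rewrite-rules-and-recursion part: a toy grammar as plain data, expanded recursively. The grammar here is an illustration for this summary, not code from the series.)&lt;&#x2F;small&gt;&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;# Rewrite rules plus recursion: expand S until only words remain.
import random

RULES = {
    &quot;S&quot;: [[&quot;NP&quot;, &quot;VP&quot;]],
    &quot;NP&quot;: [[&quot;mary&quot;], [&quot;sue&quot;], [&quot;the&quot;, &quot;book&quot;]],
    &quot;VP&quot;: [[&quot;slept&quot;], [&quot;gave&quot;, &quot;NP&quot;, &quot;NP&quot;]],  # the ditransitive pattern
}

def rewrite(symbol):
    if symbol not in RULES:  # a terminal, i.e. an actual word
        return [symbol]
    expansion = random.choice(RULES[symbol])
    return [word for part in expansion for word in rewrite(part)]

print(&quot; &quot;.join(rewrite(&quot;S&quot;)))  # e.g. mary gave sue the book
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;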
</description>
            </item>
            
            
            <item>
                <title>Adding the pwd command to xv6</title>
                <pubDate>Fri, 02 May 2014 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/pwd_command_xv6/</link>
                <guid>https://vgel.me/posts/pwd_command_xv6/</guid>
<description>&lt;p&gt;Xv6 is a fairly popular clone of Version 6 UNIX. It was made by MIT as a base for students to work off of, as the original V6 UNIX was showing its age, and only ran on the outdated PDP-11 architecture. Xv6 runs natively on x86, and supports modern features like SMP, while still being only 15k lines of easily-grokked C.&lt;&#x2F;p&gt;
&lt;p&gt;As a simple example, we&#x27;re going to add the &lt;code&gt;pwd&lt;&#x2F;code&gt; command to xv6. &lt;code&gt;pwd&lt;&#x2F;code&gt; prints the shell&#x27;s current working directory. To do this, we&#x27;ll need to write the &lt;code&gt;getcwd&lt;&#x2F;code&gt; system call, and also write the &lt;code&gt;pwd&lt;&#x2F;code&gt; userspace program.&lt;&#x2F;p&gt;
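&lt;p&gt;&lt;small&gt;(The algorithm behind a &lt;code&gt;getcwd&lt;&#x2F;code&gt;-style call, sketched in Python for brevity rather than xv6&#x27;s C: walk upward through &lt;code&gt;..&lt;&#x2F;code&gt;, matching inode numbers to recover each directory&#x27;s name. Error handling omitted.)&lt;&#x2F;small&gt;&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;# Walk .. upward; at each level, find our name by matching inodes.
import os

def pwd() -&gt; str:
    parts, here = [], &quot;.&quot;
    while True:
        cur = os.stat(here)
        parent = os.stat(here + &quot;&#x2F;..&quot;)
        if (cur.st_dev, cur.st_ino) == (parent.st_dev, parent.st_ino):
            break  # .. leads back to itself only at the root
        for name in os.listdir(here + &quot;&#x2F;..&quot;):
            st = os.lstat(os.path.join(here, &quot;..&quot;, name))
            if (st.st_dev, st.st_ino) == (cur.st_dev, cur.st_ino):
                parts.append(name)
                break
        here = here + &quot;&#x2F;..&quot;
    return &quot;&#x2F;&quot; + &quot;&#x2F;&quot;.join(reversed(parts))

print(pwd())
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;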
</description>
            </item>
            
            
            <item>
                <title>Patching Function Bytecode in Python</title>
                <pubDate>Fri, 21 Feb 2014 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/patching_function_bytecode_with_python/</link>
                <guid>https://vgel.me/posts/patching_function_bytecode_with_python/</guid>
<description>&lt;p&gt;Note to start off: this entire article is written with Python 3.x. It may or
may not work with 2.x. You can also access this article as an IPython notebook
&lt;a href=&quot;.&#x2F;patching_function_bytecode_with_python.ipynb&quot;&gt;here&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;Python is an amazingly introspective and hackable language, with a ton of cool
features like metaclasses. One sadly unappreciated feature is the ability to not
only inspect and disassemble, but actually programmatically modify the bytecode
of Python functions from inside the script. While this sounds somewhat esoteric,
I recently used it for an optimization, and decided to write a simple starter
article on how to do it. (I&#x27;d like to warn that I&#x27;m not an expert in Python
internals. If you see an error in this article please let me know so I can fix
it).&lt;&#x2F;p&gt;
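&lt;p&gt;&lt;small&gt;(On Python 3.8+, the gentlest way to try this kind of surgery is &lt;code&gt;CodeType.replace&lt;&#x2F;code&gt;. The example below swaps a constant rather than raw opcodes, since opcode values vary by version; the idea is the same.)&lt;&#x2F;small&gt;&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;# Patch a function&#x27;s code object in place: swap the constant 42 for 43.
def answer():
    return 42

code = answer.__code__
answer.__code__ = code.replace(
    co_consts=tuple(43 if c == 42 else c for c in code.co_consts)
)
print(answer())  # 43, without ever redefining the function
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;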
</description>
            </item>
            
            
            <item>
                <title>I&#x27;ve migrated my blog... again!</title>
                <pubDate>Sun, 27 Oct 2013 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/ive-migrated-my-blog-again/</link>
                <guid>https://vgel.me/posts/ive-migrated-my-blog-again/</guid>
                <description>&lt;p&gt;It seems I migrate my blog more than I update it, but I have once again. I got fed up with Heroku, and wanted a VPS for other projects anyways, so I bit the bullet, bought a VPS from Linode, and rewrote everything. I&#x27;m now using the amazing &lt;a href=&quot;http:&#x2F;&#x2F;fabfile.org&quot;&gt;fabric&lt;&#x2F;a&gt; library to statically generate my site and upload it to a VPS running nginx. I was even able to reuse most of my old code! Not that many people read this thing, but for the few who do, loading times shouldn&#x27;t be &amp;gt; 10 seconds anymore (Heroku&#x27;s free tier is seriously terrible for low-traffic sites). Besides that, everything should be basically the same, except that I also improved the RSS feed to actually validate.&lt;&#x2F;p&gt;
</description>
            </item>
            
            
            <item>
                <title>Setting up PyOpenNI in Linux</title>
                <pubDate>Thu, 08 Aug 2013 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/setting-up-pyopenni-in-linux/</link>
                <guid>https://vgel.me/posts/setting-up-pyopenni-in-linux/</guid>
                <description>&lt;p&gt;I&#x27;ve had a Kinect for a while. I wrote a few interesting scripts with it, then
forgot about it for a while. In between that time and now, I went through
several OS upgrades that wiped away my changes. I decided I wanted to mess with it
again, and had to reinstall OpenNI and the Python extensions. This is a somewhat
tricky process, so here&#x27;s how it&#x27;s done.&lt;&#x2F;p&gt;
</description>
            </item>
            
            
            <item>
                <title>Packaging Jython scripts in a single executable Jarfile</title>
                <pubDate>Wed, 24 Jul 2013 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/packaging-jython-scripts/</link>
                <guid>https://vgel.me/posts/packaging-jython-scripts/</guid>
                <description>&lt;p&gt;There&#x27;s a lot of outdated and just plain wrong information on the internet about
packaging Jython scripts. When I wanted to start a project in Jython, I figured
I should find a simple, easy way to package your app as an executable (i.e.,
double-clickable) &lt;code&gt;.jar&lt;&#x2F;code&gt;. Turns out, it&#x27;s a lot easier than it seems.&lt;&#x2F;p&gt;
</description>
            </item>
            
            
            <item>
                <title>I&#x27;ve migrated my blog!</title>
                <pubDate>Tue, 23 Jul 2013 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/migrating-my-blog/</link>
                <guid>https://vgel.me/posts/migrating-my-blog/</guid>
                <description>&lt;p&gt;I&#x27;ve migrated my blog to my own custom platform. I won&#x27;t write about why I did
it, since I don&#x27;t have a specific technical reason Blogger failed me. I simply
wanted a bit more control over the platform, wanted a personal site on my own
server anyways, and had a bit of an itch to scratch. But instead, I&#x27;ll go over
some of my technical and design choices and why I made them.&lt;&#x2F;p&gt;
</description>
            </item>
            
            
            <item>
                <title>Cracking Word Searches with Haskell</title>
                <pubDate>Fri, 31 May 2013 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/cracking-word-searches-with-haskell/</link>
                <guid>https://vgel.me/posts/cracking-word-searches-with-haskell/</guid>
<description>&lt;p&gt;After learning Haskell a few months ago, I&#x27;ve had disappointingly few opportunities to use it. For a simple data-crunch, a Python REPL starts up faster and lets me work quicker, and many of my other projects require a specific language. So when I was browsing Reddit and saw &lt;a href=&quot;http:&#x2F;&#x2F;www.reddit.com&#x2F;u&#x2F;SWEAR_WORD_SEARCH&quot;&gt;&#x2F;u&#x2F;SWEAR_WORD_SEARCH&lt;&#x2F;a&gt;, I thought it would be a fun, quick project to crack these in Haskell.
(quick warning: if you didn&#x27;t guess from the SWEAR_WORD part, this article contains some profane word searches ;-)&lt;&#x2F;p&gt;
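&lt;p&gt;&lt;small&gt;(One way to crack them, sketched here in Python rather than the post&#x27;s Haskell: read out every straight run of letters in all eight directions and check each against a word list. The grid and words are stand-ins.)&lt;&#x2F;small&gt;&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;# Enumerate every straight run of letters in all eight directions; any
# run that appears in the word list is a hit.
DIRS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]

def find_words(grid, words):
    rows, cols = len(grid), len(grid[0])
    hits = set()
    for r in range(rows):
        for c in range(cols):
            for dr, dc in DIRS:
                run, rr, cc = &quot;&quot;, r, c
                while 0 &lt;= rr &lt; rows and 0 &lt;= cc &lt; cols:
                    run += grid[rr][cc]
                    if run in words:
                        hits.add(run)
                    rr, cc = rr + dr, cc + dc
    return hits

print(find_words([&quot;cat&quot;, &quot;aro&quot;, &quot;tox&quot;], {&quot;cat&quot;, &quot;rat&quot;, &quot;tar&quot;}))  # {&#x27;cat&#x27;}
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;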
</description>
            </item>
            
            
            <item>
                <title>Fixing network list duplication with nm-applet on Linux</title>
                <pubDate>Sat, 11 Aug 2012 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/fixing-network-list-duplication-with-nm-applet-on-linux/</link>
                <guid>https://vgel.me/posts/fixing-network-list-duplication-with-nm-applet-on-linux/</guid>
<description>&lt;p&gt;I use XFCE in Ubuntu, and for a while used the default network applet built into the status bar. Eventually I wanted to get rid of it, since it had problems connecting to networks occasionally (I use a laptop and thus switch wifi networks often). Simply removing the status bar made XFCE replace it with &lt;code&gt;nm-applet&lt;&#x2F;code&gt;, which is much better. However, I was shocked to find over 500 network duplicates for one network (&lt;code&gt;network-name&lt;&#x2F;code&gt;, &lt;code&gt;network-name 1&lt;&#x2F;code&gt;, &lt;code&gt;network-name 2&lt;&#x2F;code&gt;, ..., &lt;code&gt;network-name 538&lt;&#x2F;code&gt;)! Luckily, it&#x27;s quite easy to delete this and other unused networks without manually clicking each name and then clicking delete in &lt;code&gt;nm-connection-editor&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;Standard disclaimer: I&#x27;m not responsible if you fuck up your system blah blah blah.&lt;&#x2F;p&gt;
</description>
            </item>
            
            
            <item>
                <title>Ed: The standard EDitor</title>
                <pubDate>Wed, 11 Jul 2012 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/ed-the-standard-editor/</link>
                <guid>https://vgel.me/posts/ed-the-standard-editor/</guid>
                <description>&lt;p&gt;&lt;em&gt;Since some people aren&#x27;t getting it, the beginning of this article is a joke.  What editor you use isn&#x27;t important! As long as it&#x27;s ed.&lt;&#x2F;em&gt;&lt;&#x2F;p&gt;
&lt;h2&gt;Getting started with Ed - the standard editor&lt;&#x2F;h2&gt;
&lt;p&gt;Many modern programmers might laugh at ed. &quot;What a quaint editor!&quot; It doesn&#x27;t have code completion! I can&#x27;t load it with thousands of shitty plugins! Its name is only 2 letters! It&#x27;s not written in hipster-script!
Well, they&#x27;re wrong. Ed is the standard text EDitor. It&#x27;s on every Unix machine, ever! It doesn&#x27;t waste your time with useless effects like cursors or showing you the current contents of the file. It&#x27;s efficient! It&#x27;s better! It&#x27;s... ed!&lt;&#x2F;p&gt;
</description>
            </item>
            
            
            <item>
                <title>Installing and using FreeTTS in Java on Linux</title>
                <pubDate>Sat, 26 May 2012 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/installing-and-using-freetts-in-java-on-linux/</link>
                <guid>https://vgel.me/posts/installing-and-using-freetts-in-java-on-linux/</guid>
                <description>&lt;p&gt;FreeTTS is a quite handy text-to-speech synthesizer that works in Java. It&#x27;s also a bitch to install and use.&lt;&#x2F;p&gt;
</description>
            </item>
            
            
            <item>
                <title>So, started a blog</title>
                <pubDate>Fri, 18 May 2012 00:00:00 +0000</pubDate>
                <link>https://vgel.me/posts/so-started-a-blog/</link>
                <guid>https://vgel.me/posts/so-started-a-blog/</guid>
<description>&lt;p&gt;Since proggit doesn&#x27;t allow self posts, I can stick my thoughts here and link them instead.&lt;&#x2F;p&gt;
</description>
            </item>
            
    </channel>
</rss>