Move base64 algorithms to Infra #2920

Merged (1 commit), Aug 15, 2017
198 changes: 11 additions & 187 deletions source
@@ -143,24 +143,6 @@

.parse-error-table td > p:first-child { margin-top: 0; }

#base64-table {
white-space: nowrap;
font-size: 0.6em;
column-width: 6em;
column-count: 5;
column-gap: 1em;
-moz-column-width: 6em;
-moz-column-count: 5;
-moz-column-gap: 1em;
-webkit-column-width: 6em;
-webkit-column-count: 5;
-webkit-column-gap: 1em;
}
#base64-table thead { display: none; }
#base64-table * { border: none; }
#base64-table tbody td:first-child:after { content: ':'; }
#base64-table tbody td:last-child { text-align: right; }

#named-character-references-table {
white-space: nowrap;
font-size: 0.6em;
@@ -2450,6 +2432,8 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute
<dfn data-x="set append" data-x-href="https://infra.spec.whatwg.org/#set-append">append</dfn></li>
<li>The <dfn data-x-href="https://infra.spec.whatwg.org/#struct">struct</dfn> specification type and the associated definition for
<dfn data-x="struct item" data-x-href="https://infra.spec.whatwg.org/#struct-item">item</dfn></li>
<li>The <dfn data-x-href="https://infra.spec.whatwg.org/#forgiving-base64-encode">forgiving-base64 encode</dfn> and
<dfn data-x-href="https://infra.spec.whatwg.org/#forgiving-base64-decode">forgiving-base64 decode</dfn> algorithms</li>
<li><dfn data-x-href="https://infra.spec.whatwg.org/#html-namespace">HTML namespace</dfn></li>
<li><dfn data-x-href="https://infra.spec.whatwg.org/#mathml-namespace">MathML namespace</dfn></li>
<li><dfn data-x-href="https://infra.spec.whatwg.org/#svg-namespace">SVG namespace</dfn></li>
@@ -90025,7 +90009,7 @@ interface <dfn>WindowOrWorkerGlobalScope</dfn> {

// base64 utility methods
DOMString <span data-x="dom-btoa">btoa</span>(DOMString data);
DOMString <span data-x="dom-atob">atob</span>(DOMString data);
ByteString <span data-x="dom-atob">atob</span>(DOMString data);

// timers
long <span data-x="dom-setTimeout">setTimeout</span>(<span>TimerHandler</span> handler, optional long timeout = 0, any... arguments);
@@ -90114,180 +90098,23 @@ document.body.appendChild(frame)</pre>
<p>The <dfn data-x="dom-btoa"><code id="dom-windowbase64-btoa">btoa(<var>data</var>)</code></dfn>
method must throw an <span>"<code>InvalidCharacterError</code>"</span> <code>DOMException</code>
if <var>data</var> contains any character whose code point is greater than U+00FF. Otherwise, the
user agent must convert <var>data</var> to a sequence of octets whose <var>n</var>th octet is the
eight-bit representation of the code point of the <var>n</var>th character of <var>data</var>, and
then must apply the base64 algorithm to that sequence of octets, and return the result. <ref
spec=RFC4648><!--base64--></p>
<!-- Aryeh says: This seems to be what all browsers do as of January 2011 (except IE, which
doesn't support these functions at all). -->

user agent must convert <var>data</var> to a byte sequence whose <var>n</var>th byte is the
eight-bit representation of the <var>n</var>th code point of <var>data</var>, and then must apply
<span>forgiving-base64 encode</span> to that byte sequence and return the result.</p>
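
The two requirements in the btoa() paragraph above (reject anything above U+00FF, then base64-encode one byte per code point) can be sketched in plain JavaScript. This is only an illustration under stated assumptions: `btoaSketch` and `ENCODE_ALPHABET` are hypothetical names, a `RangeError` stands in for the "InvalidCharacterError" DOMException, and the encoder is ordinary padded base64, which is what forgiving-base64 encode is assumed to produce here.

```javascript
// Hypothetical names for illustration; not spec text and not a browser API.
const ENCODE_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

function btoaSketch(data) {
  // Convert the string to bytes, rejecting anything above U+00FF.
  const bytes = [];
  for (let i = 0; i < data.length; i++) {
    const codeUnit = data.charCodeAt(i);
    if (codeUnit > 0xff) {
      // Assumption: a browser throws an "InvalidCharacterError" DOMException
      // here; RangeError is only a stand-in for this sketch.
      throw new RangeError("InvalidCharacterError");
    }
    bytes.push(codeUnit);
  }

  // Encode each group of up to 3 bytes as 4 characters, padding with "=".
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const hasSecond = i + 1 < bytes.length;
    const hasThird = i + 2 < bytes.length;
    const b0 = bytes[i];
    const b1 = hasSecond ? bytes[i + 1] : 0;
    const b2 = hasThird ? bytes[i + 2] : 0;
    out += ENCODE_ALPHABET[b0 >> 2];
    out += ENCODE_ALPHABET[((b0 & 0x03) << 4) | (b1 >> 4)];
    out += hasSecond ? ENCODE_ALPHABET[((b1 & 0x0f) << 2) | (b2 >> 6)] : "=";
    out += hasThird ? ENCODE_ALPHABET[b2 & 0x3f] : "=";
  }
  return out;
}
```

For example, `btoaSketch("a")` yields `"YQ=="`, while `btoaSketch("\u0100")` throws because U+0100 does not fit in a single byte.
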

<p>The <dfn data-x="dom-atob"><code id="dom-windowbase64-atob">atob(<var>data</var>)</code></dfn>
method, when invoked, must run the following steps:</p>

<ol>
<li><p>Let <var>decodedData</var> be the result of running <span>forgiving-base64 decode</span>
on <var>data</var>.</p></li>

<!-- Aryeh says: Copies Firefox behavior as of January 2011 (4.0b8). WebKit is somewhat laxer,
Member

Should we preserve this info, at least as comments, in the Infra copy? The mailing list link seems especially interesting.

Member Author

The reason I didn't is that all browsers pass the atob() tests, which test all of this. So it would only be useful if we expect someone to try to remove that later on, but I guess it doesn't hurt either. I'll add that link back.

and Opera throws no exceptions at all. gsnedders reports Opera's behavior causes site-compat
problems, and I figure most sites depend on Firefox if on anything, so go with that. -->

<!-- Since updated to drop whitespace, based on the arguments here:
https://lists.w3.org/Archives/Public/public-whatwg-archive/2011May/0207.html
-->

<li><p>Let <var>position</var> be a pointer into <var>data</var>, initially
pointing at the start of the string.</p></li>

<li><p>Remove all <span>ASCII whitespace</span> from <var>data</var>.</p></li>

<li><p>If the length of <var>data</var> divides by 4 leaving no remainder, then: if
<var>data</var> ends with one or two U+003D EQUALS SIGN (=) characters, remove them from
<var>data</var>.</p></li>

<li><p>If the length of <var>data</var> divides by 4 leaving a remainder of 1, throw an
<span>"<code>InvalidCharacterError</code>"</span> <code>DOMException</code> and abort these
steps.</p>

<li>

<p>If <var>data</var> contains a character that is not in the following list of
characters and character ranges, throw an <span>"<code>InvalidCharacterError</code>"</span>
<code>DOMException</code> and abort these steps:</p>

<ul class="brief">
<li>U+002B PLUS SIGN (+)
<li>U+002F SOLIDUS (/)
<li><span>ASCII alphanumeric</span>
</ul>

</li>

<li><p>Let <var>output</var> be a string, initially empty.</p></li>

<li><p>Let <var>buffer</var> be a buffer that can have bits appended to it, initially
empty.</p></li>

<li>

<p>While <var>position</var> does not point past the end of <var>data</var>, run these
substeps:</p>

<ol>

<li>

<p>Find the character pointed to by <var>position</var> in the first column of the
following table. Let <var>n</var> be the number given in the second cell of the same
row.</p>

<div id="base64-table">
<table>
<thead>
<tr>
<th>Character
<th>Number
<tbody>
<tr><td>A<td>0
<tr><td>B<td>1
<tr><td>C<td>2
<tr><td>D<td>3
<tr><td>E<td>4
<tr><td>F<td>5
<tr><td>G<td>6
<tr><td>H<td>7
<tr><td>I<td>8
<tr><td>J<td>9
<tr><td>K<td>10
<tr><td>L<td>11
<tr><td>M<td>12
<tr><td>N<td>13
<tr><td>O<td>14
<tr><td>P<td>15
<tr><td>Q<td>16
<tr><td>R<td>17
<tr><td>S<td>18
<tr><td>T<td>19
<tr><td>U<td>20
<tr><td>V<td>21
<tr><td>W<td>22
<tr><td>X<td>23
<tr><td>Y<td>24
<tr><td>Z<td>25
<tr><td>a<td>26
<tr><td>b<td>27
<tr><td>c<td>28
<tr><td>d<td>29
<tr><td>e<td>30
<tr><td>f<td>31
<tr><td>g<td>32
<tr><td>h<td>33
<tr><td>i<td>34
<tr><td>j<td>35
<tr><td>k<td>36
<tr><td>l<td>37
<tr><td>m<td>38
<tr><td>n<td>39
<tr><td>o<td>40
<tr><td>p<td>41
<tr><td>q<td>42
<tr><td>r<td>43
<tr><td>s<td>44
<tr><td>t<td>45
<tr><td>u<td>46
<tr><td>v<td>47
<tr><td>w<td>48
<tr><td>x<td>49
<tr><td>y<td>50
<tr><td>z<td>51
<tr><td>0<td>52
<tr><td>1<td>53
<tr><td>2<td>54
<tr><td>3<td>55
<tr><td>4<td>56
<tr><td>5<td>57
<tr><td>6<td>58
<tr><td>7<td>59
<tr><td>8<td>60
<tr><td>9<td>61
<tr><td>+<td>62
<tr><td>/<td>63
</table>
</div>

</li>

<li><p>Append to <var>buffer</var> the six bits corresponding to <var>number</var>, most significant bit first.</p></li>

<li><p>If <var>buffer</var> has accumulated 24 bits, interpret them as three 8-bit
big-endian numbers. Append the three characters with code points equal to those numbers to <var>output</var>, in the same order, and then empty <var>buffer</var>.</p></li>

<li><p>Advance <var>position</var> by one character.</p></li>

</ol>

</li>

<li>

<p>If <var>buffer</var> is not empty, it contains either 12 or 18 bits. If it contains
12 bits, discard the last four and interpret the remaining eight as an 8-bit big-endian number.
If it contains 18 bits, discard the last two and interpret the remaining 16 as two 8-bit
big-endian numbers. Append the one or two characters with code points equal to those one or two
numbers to <var>output</var>, in the same order.</p>

<p class="note">The discarded bits mean that, for instance, <code data-x="">atob("YQ")</code> and
<code data-x="">atob("YR")</code> both return "<code data-x="">a</code>".</p>

</li>

<li><p>Return <var>output</var>.</p></li>
<li><p>If <var>decodedData</var> is failure, then throw an
<span>"<code>InvalidCharacterError</code>"</span> <code>DOMException</code>.</p></li>

<li><p>Return <var>decodedData</var>.</p></li>
</ol>
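
The new three-step atob() wraps a decoder that either returns data or signals failure. A minimal JavaScript sketch follows, with the decoder written from the steps this diff removes (strip whitespace, strip padding, validate, accumulate 6-bit groups). The names `forgivingBase64DecodeSketch`, `atobSketch`, and `DECODE_ALPHABET` are hypothetical, `null` stands in for "failure", and a `RangeError` stands in for the "InvalidCharacterError" DOMException. It is assumed, not confirmed by this page, that the Infra algorithm behaves the same way as the removed steps.

```javascript
// Hypothetical names for illustration; not spec text and not a browser API.
const DECODE_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Returns a string whose code points are the decoded bytes,
// or null as a stand-in for "failure".
function forgivingBase64DecodeSketch(data) {
  // Remove all ASCII whitespace (tab, LF, FF, CR, space).
  data = data.replace(/[\t\n\f\r ]/g, "");
  // If the length is a multiple of 4, drop one or two trailing "=".
  if (data.length % 4 === 0) {
    data = data.replace(/={1,2}$/, "");
  }
  // A remainder of 1 cannot represent a whole byte.
  if (data.length % 4 === 1) return null;
  // Only "+", "/" and ASCII alphanumerics may remain.
  if (/[^+\/0-9A-Za-z]/.test(data)) return null;

  let output = "";
  let buffer = 0; // bit accumulator, never holds more than 24 bits
  let bitCount = 0;
  for (const ch of data) {
    // Append the six bits for this character, most significant bit first.
    buffer = (buffer << 6) | DECODE_ALPHABET.indexOf(ch);
    bitCount += 6;
    if (bitCount === 24) {
      output += String.fromCharCode(
        (buffer >> 16) & 0xff, (buffer >> 8) & 0xff, buffer & 0xff);
      buffer = 0;
      bitCount = 0;
    }
  }
  if (bitCount === 12) {
    // Discard the last four bits, keep one byte.
    output += String.fromCharCode((buffer >> 4) & 0xff);
  } else if (bitCount === 18) {
    // Discard the last two bits, keep two bytes.
    output += String.fromCharCode((buffer >> 10) & 0xff, (buffer >> 2) & 0xff);
  }
  return output;
}

function atobSketch(data) {
  const decodedData = forgivingBase64DecodeSketch(data);
  if (decodedData === null) {
    // Assumption: stand-in for the "InvalidCharacterError" DOMException.
    throw new RangeError("InvalidCharacterError");
  }
  return decodedData;
}
```

With this sketch, `atobSketch("YQ")` and `atobSketch("YR")` both return `"a"`, because the two inputs differ only in the four discarded bits, and `atobSketch(" Y Q = = ")` also returns `"a"` since whitespace is removed before the padding check.
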

<!-- Note: this function is defined explicitly here because RFC4648 does not specify how to handle
erroneous input, and no preexisting browser implementation simply throws an exception on all
erroneous input. -->

</div>


@@ -120244,9 +120071,6 @@ INSERT INTERFACES HERE
<dt id="refsRFC7595">[RFC7595]</dt>
<dd><cite><a href="https://tools.ietf.org/html/rfc7595">Guidelines and Registration Procedures for URI Schemes</a></cite>, D. Thaler, T. Hansen, T. Hardie. IETF.</dd>

<dt id="refsRFC4648">[RFC4648]</dt>
<dd><cite><a href="https://tools.ietf.org/html/rfc4648">The Base16, Base32, and Base64 Data Encodings</a></cite>, S. Josefsson. IETF.</dd>

<dt id="refsRFC5322">[RFC5322]</dt>
<dd><cite><a href="https://tools.ietf.org/html/rfc5322">Internet Message Format</a></cite>, P. Resnick. IETF.</dd>
