Internet-Draft M. Toomim
Expires: Aug 20, 2026 Invisible College
Intended status: Proposed Standard Mar 02, 2026
HTTP Resource Versioning
draft-toomim-httpbis-versions-05
Abstract
HTTP resources change over time. Each change to a resource creates a
new "version" of its state. HTTP systems often need a way to
identify, read, write, navigate, and/or merge these versions, in
order to implement cache consistency, create history archives, settle
race conditions, request incremental updates to resources, interpret
incremental updates to versions, or implement distributed
collaborative editing algorithms.
This document analyzes existing methods of versioning in HTTP,
highlights limitations, and specifies a more general versioning
approach that can enable new use-cases for HTTP. An upgrade path for
legacy intermediaries is provided.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts. The list of current Internet-Drafts is at
https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
https://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at
https://www.ietf.org/shadow.html
Table of Contents
1. Introduction
2. HTTP Resource Versioning
2.1. Model of time
2.2. Version and Parents headers
2.2.1. Formatting of Version IDs in versioning headers
2.2.2. Definition of Events and Versions
2.3. Using Versioning with HTTP Methods
2.3.1. GET the current version
2.3.2. GET a specific version
2.3.3. PUT, POST, or PATCH a new version
2.3.4. GET a range of historical versions
2.3.5. Rules for Version and Parents headers
2.3.6. Interaction with Conditional Request Headers
2.5. Status 432: Version Not Found
2.6. The Current-Version header
2.7. Version-Type header
2.8. Versioning through intermediaries
2.8.1. Detecting legacy intermediaries
3. Analysis of existing versioning approaches in HTTP
3.1. Approaches with versions in headers
3.1.1. Last-Modified
3.1.2. ETag
3.2. Approaches defining versions as new resources
3.2.1. Creating additional "Version Resources"
3.2.2. Standardizing versions as external resources
3.4. Requirements for general versioning
3.4.1. Distributed Time
3.4.2. Included within existing HTTP requests and responses
3.4.3. Generalizable Timestamps
3.5. Feature table of existing approaches
4. Informative Use-Cases
4.1. Incremental RSS Subscription
4.2. Hosting git via HTTP
4.3. Resumable uploads protocol
4.4. Distributed collaborative editing
4.5. Improved Header Compression with runs
4.6. Run-Length Compression without Header Compression
5. Acknowledgements
6. Conventions
7. IANA Considerations
7.1. HTTP Header Registrations
7.2. Version Type Registry
7.2.1. Procedure
7.2.2. Version Type Registrations
7.2.2.1. The peer-counter Version-Type
7.2.2.1.1. The "text-runs" modifier
7.2.2.1.2. The "bytestream" modifier
7.2.3. Comments on Version Type Registrations
7.2.4. Change Procedures
8. Copyright Notice
9. Security Considerations
10. Authors' Addresses
11. References
11.1. Normative References
11.2. Informative References
1. Introduction
HTTP resources change over time. Any single computer on the network
can observe a resource changing, over time, in a linear sequence of
versions:
o <-- oldest version
|
o
|
o
|
o <-- newest version
We call this a "linear" history.
However, when multiple networked computers change a resource
"simultaneously" -- before their edits propagate to one another --
the global perspective of version history actually forks into a DAG,
or "partial order":
      o      <-- oldest version
     / \
    o   o
     \ /
      o
      |
      o      <-- newest version
HTTP systems often need to specify, identify, and navigate these
histories of versions in order to (1) implement cache consistency,
(2) create history archives, (3) settle race conditions, (4) request
incremental updates to resources, (5) interpret incremental updates
to versions, or (6) merge parallel mutations in distributed
collaborative editing algorithms. Section 4 demonstrates these
use-cases concretely.
Each of these systems needs a way to represent *time*, but different
systems have different requirements for how time is represented.
This document enumerates these needs, and proposes a general
versioning system to satisfy them.
(Note that this document does NOT speak to the versioning of HTTP
APIs -- only HTTP resources, which are used within APIs.)
2. HTTP Resource Versioning
This section defines the core concepts and mechanisms for HTTP
Resource Versioning.
2.1. Model of time
This specification adds a dimension of time to HTTP resources.
Whereas HTTP Resources already have a dimension of *space* -- queried
with the Range header -- this specification provides Resources a
dimension of *time* -- queried by the Version header:
== Time and Space as dimensions of Resources ==
   Resource Representation  <-- URI + Content-Type: header
      ^
      |
      |
      |     Time  <-- Version: header
      |    /
      |   /
      |  /
      | /
      +----------------> Space  <-- Range: header
Any Resource can thus be queried for its Time and Space, via a
Version and Range:
GET /foo
Version: "v1.2.3"
Range: bytes=500-1000
In this model, the term "Version" is synonymous with "Timestamp" -- a
Version represents a point in distributed time. Thus, a Version, on
its own, identifies a version of the Universe of Resources -- the
version that existed at that time. We can use a Version to identify
a point of time for a single Resource, or for multiple Resources, to
identify their states at the same version in time. A Version can be
specified without reference to a particular resource.
It follows that a "Version of a Resource" specifies the state of the
Resource at the time of that Version. The "Version of a
Representation" further implies a particular state of the
Representation body, or "snapshot."
Note that this model differs in terminology from some other
mechanisms, such as WebDAV Versioning, which use the term Version for
what we call a "Version of a Representation."
A resource's "Version History" is the partial order (or DAG) of
Versions of its state. The following example history has one fork
and one merge:
        "1"                   |  Time
       /   \                  |
     "2"   "3"                |
       \   /                  |
      "2","3"  <-- merge      |
         |                    |
        "4"                   V
The semantics of a merge is specified by the resource's Merge-Type.
(See [Merge-Types].)
2.2. Version and Parents headers
This specification introduces two new HTTP headers: Version and
Parents. These headers communicate version information in requests
and responses.
The Version header specifies the current version of a resource:
Version: "dkn7ov2vwg"
The Parents header specifies the immediate predecessor version(s):
Parents: "ajtva12kid", "cmdpvkpll2"
These headers may be used in PUT, PATCH, POST, and DELETE requests,
and GET and HEAD responses, to convey the version before and after
the Update conveyed by the HTTP message. (See [Updates].)
These headers can also be used in GET and HEAD requests to ask a
server for a specific version or range of version history.
If an Update (a PUT, PATCH, POST, or DELETE request, or a GET or HEAD
response) does not specify a Version header, the recipient MAY
generate and assign it a new version ID. If an Update does not
specify a Parents header, the recipient MAY presume that the most
recent versions it has (the frontier of time) are the parents of the
new version.
To describe a merger, a Version header MAY contain multiple IDs:
Version: "dkn7ov2vwg", "v2vwgdkn7o"
2.2.1. Formatting of Version IDs in versioning headers
The Version and Parents headers are formatted as a list of strings
and/or display strings in the Structured Headers format [RFC9651].
Each string is called an Event ID.
Version: <version>
Parents: <version>
<version>: <event-id>, <event-id>, ...
<event-id>: <string> | <display-string>
The ordering of IDs within a Version or Parents header carries no
meaning. Event IDs SHOULD be sorted lexicographically whenever
received or sent, with exactly one space after "," separators, to
canonicalize the set's serialization as a unique string, e.g. as a
unique cache key.
The formatting of Event IDs is constrained according to the
resource's Version-Type, as defined in Section 2.7.
Note that structured header strings "..." can only contain printable
ASCII. If an Event ID contains Unicode characters beyond printable
ASCII, it should be formatted as a display string %"..." instead.
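The canonical form described above can be sketched in Python. This is a minimal illustration, not a normative algorithm: the function names are hypothetical, sorting by code point is only one reading of "lexicographically", and the escaping rules follow our reading of the String and Display String serializations in [RFC9651].

```python
def serialize_event_id(event_id: str) -> str:
    """Serialize one Event ID as a String, or as a Display String if needed."""
    if all(0x20 <= ord(ch) <= 0x7E for ch in event_id):
        # Printable ASCII: a plain String, escaping backslash and double quote.
        escaped = event_id.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    # Otherwise: a Display String, percent-encoding the UTF-8 bytes that
    # cannot appear literally ('%', '"', control bytes, and non-ASCII bytes).
    out = []
    for byte in event_id.encode("utf-8"):
        if byte in (0x25, 0x22) or byte <= 0x1F or byte >= 0x7F:
            out.append("%{:02x}".format(byte))
        else:
            out.append(chr(byte))
    return '%"' + "".join(out) + '"'


def canonical_version(event_ids) -> str:
    """Serialize a set of Event IDs: de-duplicated, sorted, joined by ', '."""
    return ", ".join(serialize_event_id(e) for e in sorted(set(event_ids)))


print(canonical_version(["v2vwgdkn7o", "dkn7ov2vwg"]))
# "dkn7ov2vwg", "v2vwgdkn7o"
```

Because the ordering of IDs carries no meaning, two peers that serialize the same set this way obtain the same string, which is what makes it usable as a cache key.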
2.2.2. Definition of Events and Versions
An "Event" is a unique ID assigned to a single event in history at a
peer. Any peer can define an event, by generating a unique ID.
Events are generally defined at state mutations, but events MAY also
be defined at a "no-op mutation", when a unique identifier to mark
time is desired.
Conceptually, for any event E, all prior events that have been
observed by that peer before defining E can be referenced as the set
ancestors(E). The frontier of this set, or frontier(ancestors(E)),
is defined as the events that are not in the ancestors(e') for any
other event e' in ancestors(E). We define the parents(E) to be this
frontier(ancestors(E)).
A Version in time is likewise defined by the set of observed events
at that time, and we define any Version V as the
frontier(observed_events) at that time.
Thus, the version immediately after any event E is simply the set
{E}, and the version after the merger of two parallel events E1 and
E2 is the set {E1, E2}.
As a result, the version of a resource immediately after a mutation
will contain just a single string:
Version: "foo-123"
The version of a merger will generally contain multiple strings:
Version: "foo-123", "bar-abc"
However, it is also possible for a peer to define a no-op event that
represents the merger as a single string:
Parents: "foo-123", "bar-abc"
Version: "{foo-123, bar-abc}"
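The definitions of ancestors(E) and frontier(...) above can be made concrete with a short Python sketch. It assumes the history is available as a hypothetical mapping from each event ID to the set of its parents; the function names mirror the prose but are otherwise our own.

```python
def ancestors(event, parents):
    """Return every event reachable from `event` by following parent links."""
    seen = set()
    stack = list(parents.get(event, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def frontier(events, parents):
    """Return the events that are not an ancestor of any other event in the set."""
    events = set(events)
    covered = set()
    for event in events:
        covered |= ancestors(event, parents)
    return events - covered


# The example history from Section 2.1: "1" forks into "2" and "3",
# and "4" is defined after observing both.
PARENTS = {"2": {"1"}, "3": {"1"}, "4": {"2", "3"}}

# The Version after observing events 1, 2, and 3 is the merger {"2", "3"}.
print(sorted(frontier({"1", "2", "3"}, PARENTS)))           # ['2', '3']
# parents("4") is the frontier of its ancestors.
print(sorted(frontier(ancestors("4", PARENTS), PARENTS)))   # ['2', '3']
```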
2.3. Using Versioning with HTTP Methods
The Version and Parents headers can be set in the requests and
responses of the following methods:
- GET
- HEAD
- PUT
- PATCH
- POST
For each of these methods, if a Version or Parents header exists in
the request, it should also exist in the response, and if not, the
server may nonetheless specify it in its response.
We now detail the ways in which these headers modify requests and
responses for these methods.
2.3.1. GET the current version
If the Version: header is not specified, a GET request returns the
current version of the state as usual:
Request:
GET /chat
Response:
HTTP/1.1 200 OK
Version: "ej4lhb9z78"
Parents: "oakwn5b8qh", "uc9zwhw7mf"
Content-Type: application/json
Content-Length: 64
[{"text": "Hi, everyone!",
"author": {"link": "/user/tommy"}}]
The server MAY include a Version and/or Parents header in the
response, to indicate the current version and its parents.
Clients can use a HEAD request to elicit versioning history without
downloading the body:
Request:
HEAD /chat
Response:
HTTP/1.1 200 OK
Version: "ej4lhb9z78"
Parents: "oakwn5b8qh", "uc9zwhw7mf"
Content-Type: application/json
2.3.2. GET a specific version
A server can allow clients to request historical versions of a
resource in GET requests by responding to the Version and Parents
headers. A client can specify a specific version that it wants with
the Version header:
Request:
GET /chat
Version: "ej4lhb9z78"
Response:
HTTP/1.1 200 OK
Version: "ej4lhb9z78"
Parents: "oakwn5b8qh", "uc9zwhw7mf"
Content-Type: application/json
Content-Length: 64
[{"text": "Hi, everyone!",
"author": {"link": "/user/tommy"}}]
2.3.3. PUT, POST, or PATCH a new version
When a PUT, POST, or PATCH request changes the state of a
resource, it can specify the new version of the resource, and the
parent version that it was based on:
Request:
PUT /chat
Version: "ej4lhb9z78"
Parents: "oakwn5b8qh", "uc9zwhw7mf"
Content-Type: application/json
Content-Length: 64
[{"text": "Hi, everyone!",
"author": {"link": "/user/tommy"}}]
Response:
HTTP/1.1 200 OK
The Version and Parents headers are optional. If Version is omitted,
the recipient may assign a new event ID. If Parents is omitted, the
recipient may assume that its current version is the parent of the
new version.
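One way a recipient might apply these defaults is sketched below. The Resource class and its fields are hypothetical, each update is assumed to carry a single event ID, and merging of bodies is out of scope (bodies are simply stored per event).

```python
import uuid


class Resource:
    """Hypothetical in-memory resource that records its version history."""

    def __init__(self):
        self.parents_of = {}         # event ID -> frozenset of parent event IDs
        self.bodies = {}             # event ID -> body received with that event
        self.current = frozenset()   # the frontier: the current version

    def apply_update(self, body, version=None, parents=None):
        # No Version header: the recipient may assign a new event ID.
        event_id = version if version is not None else uuid.uuid4().hex
        # No Parents header: presume the current frontier is the parent set.
        parent_set = frozenset(parents) if parents is not None else self.current
        self.parents_of[event_id] = parent_set
        self.bodies[event_id] = body
        # The new event supersedes the frontier events it has observed.
        self.current = (self.current - parent_set) | {event_id}
        return event_id


chat = Resource()
chat.apply_update("first", version="a")                 # Version given, no Parents
assigned = chat.apply_update("second")                  # neither header given
chat.apply_update("third", version="c", parents=["a"])  # parallel to `assigned`
print(sorted(chat.current) == sorted([assigned, "c"]))  # True
```

The last update names "a" as its parent rather than the current frontier, so the frontier ends up holding two parallel events, which is the situation a Version header with multiple IDs describes.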
2.3.4. GET a range of historical versions
A client can request a range of history by including a Parents and a
Version header together. The Parents marks the beginning of the
range (the oldest versions) and the Version marks the end of the
range (the newest versions) that it requests.
Request:
GET /chat
Version: "3"
Parents: "1a", "1b"
Response:
HTTP/1.1 209 Multiresponse
Current-Version: "3"
HTTP/1.1 200 OK
Version: "2"
Parents: "1a", "1b"
Content-Type: application/json
Content-Length: 64
[{"text": "Hi, everyone!",
"author": {"link": "/user/tommy"}}]
HTTP/1.1 200 OK
Version: "3"
Parents: "2"
Content-Type: application/json
Merge-Type: sync9
Content-Length: 117
[{"text": "Hi, everyone!",
"author": {"link": "/user/tommy"}},
{"text": "Yo!",
"author": {"link": "/user/yobot"}}]
Note that this example uses a new "Multiresponse" code, which is
currently being drafted [Multiresponse]. See [Braid-HTTP] Section 3
for an earlier draft of the semantics.
If the client just wants the version identifiers for a range of
history, it can use a HEAD request:
Request:
HEAD /chat
Parents: "ej4lhb9z78"
Version: "oakwn5b8qh"
Response:
HTTP/1.1 209 Multiresponse
HTTP/1.1 200 OK
Parents: "ej4lhb9z78"
Version: "i8qa3vu1s0k"
HTTP/1.1 200 OK
Parents: "i8qa3vu1s0k"
Version: "v1voi252yr"
HTTP/1.1 200 OK
Parents: "i8qa3vu1s0k"
Version: "jn4zuwo0wum"
...
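A server could select the events to include in such a range as sketched below. This is an illustration under assumptions: the history is a hypothetical mapping from event ID to parent set, the range is taken to be everything observed at the requested Version that was not already observed at the requested Parents, and events are emitted oldest first so that each one follows its parents.

```python
def observed(version, parents):
    """Return the events in `version` plus all of their ancestors."""
    seen = set()
    stack = list(version)
    while stack:
        event = stack.pop()
        if event not in seen:
            seen.add(event)
            stack.extend(parents.get(event, ()))
    return seen


def history_range(range_parents, range_version, parents):
    """List the events after `range_parents` up to `range_version`, oldest first."""
    wanted = observed(range_version, parents) - observed(range_parents, parents)
    ordered, emitted = [], set()
    while len(ordered) < len(wanted):
        # An event is ready once all of its parents inside the range are emitted.
        ready = sorted(
            e for e in wanted - emitted
            if all(p in emitted or p not in wanted for p in parents.get(e, ()))
        )
        if not ready:
            raise ValueError("version history contains a cycle")
        ordered.extend(ready)
        emitted.update(ready)
    return ordered


# The first example in this section: Parents "1a", "1b" and Version "3".
HISTORY = {"2": {"1a", "1b"}, "3": {"2"}}
print(history_range({"1a", "1b"}, {"3"}, HISTORY))   # ['2', '3']
```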
2.3.5. Rules for Version and Parents headers
If a GET request contains a Version header:
- If the Parents header is absent, the server SHOULD return a
single response, containing the requested version of the resource
in its body, with the Version response header set to the same
version.
- If the server does not support historical versions, it MAY ignore
the Version header and respond as usual, but MUST NOT include the
Version header in its response.
If a GET request contains a Parents header:
- The server SHOULD send the set of versions updating the Parents
to the specified Version. If no Version is specified, then it
should update the client to the server's current version.
- If the server does not support historical versions, then it MAY
ignore the Parents header, but MUST NOT include the Parents
header in its response.
A server does not need to honor historical version requests for all
documents, for all history. If a server no longer has the historical
context needed to honor a request, it responds with error code 432
Version Not Found.
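The rules in this section can be read as a small decision procedure. The sketch below is one possible reading, not a normative algorithm: the function name and the returned labels are hypothetical, header values are treated as opaque strings, and `known_versions` stands in for whatever history the server has retained.

```python
def plan_get_response(headers, supports_history, known_versions):
    """Decide how to answer a GET.

    `headers` maps lower-cased header names to raw header values.
    Returns one of the hypothetical labels used below.
    """
    version = headers.get("version")
    parents = headers.get("parents")
    if version is None and parents is None:
        return "current"               # ordinary GET of the current state
    if not supports_history:
        # The server MAY ignore the headers, but MUST NOT echo them back.
        return "current-unversioned"
    requested = [value for value in (version, parents) if value is not None]
    if any(value not in known_versions for value in requested):
        return "432"                   # Version Not Found
    if parents is not None:
        return "range"                 # updates from Parents to Version (or current)
    return "single"                    # one response with the requested version


known = {'"1"', '"2"'}
print(plan_get_response({"version": '"2"'}, True, known))                   # single
print(plan_get_response({"version": '"2"', "parents": '"1"'}, True, known)) # range
print(plan_get_response({"version": '"9"'}, True, known))                   # 432
```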
2.3.6. Interaction with Conditional Request Headers
The Version and Parents headers operate independently of the
conditional request headers (If-Match, If-None-Match,
If-Modified-Since, If-Unmodified-Since) defined in [RFC9110] Section
13. Conditional headers evaluate against entity-tags (ETags) or
modification dates, while Version and Parents reference version
identifiers.
When both are present, the server SHOULD first resolve the requested
version, then evaluate any preconditions.
The Parents header in mutation requests (PUT, PATCH, POST) serves a
related but distinct purpose from If-Match: whereas If-Match causes
the request to fail if the resource's current state does not match,
Parents specifies the version lineage of the mutation, which the
server may use to detect conflicts, perform merges, or construct
version history.
2.5. Status 432: Version Not Found
If a server does not have a version required by a request, it should
respond with status 432 Version Not Found, and include copies of the
request header(s) with versions it could not satisfy. For example,
if a client does:
GET /foo
Version: "alice-44"
Parents: "bob-32"
...but the server does not have the version "alice-44", it should
respond with:
HTTP/1.1 432 Version Not Found
Version: "alice-44"
Peers can drop history at any time. Clients cannot rely on any
particular portion of history existing on a server or intermediary
when they make requests.
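A server might build the 432 response as in the following sketch. The function name and the `has_version` predicate are hypothetical; header values are compared as raw strings for simplicity.

```python
def version_not_found(request_headers, has_version):
    """Return (status, reason, headers) for a 432 response, or None.

    `has_version` is a caller-supplied predicate that reports whether the
    server still holds the version named by a raw header value.
    """
    missing = {
        name: value
        for name, value in request_headers.items()
        if name.lower() in ("version", "parents") and not has_version(value)
    }
    if not missing:
        return None   # every requested version can be satisfied
    # Copy back only the header(s) whose versions could not be satisfied.
    return 432, "Version Not Found", missing


request = {"Version": '"alice-44"', "Parents": '"bob-32"'}
retained = {'"bob-32"'}
print(version_not_found(request, lambda value: value in retained))
# (432, 'Version Not Found', {'Version': '"alice-44"'})
```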
2.6. The Current-Version header
While sending historical versions, a server or client can specify its
current latest version with the Current-Version header. The other
party may desire this information to know when it has caught up with
the latest version. This header is also used in the resumable uploads
example in Section 4.3.
2.7. Version-Type header
The optional Version-Type header specifies constraints on the format
and interpretation of event IDs. This allows for various
optimizations and specialized versioning schemes.
For example:
Version-Type: git
This could indicate that event IDs will be git-style hashes,
branches, or tags. Peers could verify that the entire repository at
a given version hashes to the specified ID.
Another example:
Version-Type: peer-counter; text-runs
Diamond-Types, Automerge, and other algorithms use this format to
compress history metadata through run-length encoding of consecutive
insertions. This allows a set of 50 inserted characters to be stored
as 50 bytes plus one event ID, rather than 50 bytes plus 50 event
IDs (each of which takes up multiple bytes).
Implementers may define custom Version-Types to suit specific needs:
Version-Type: version-vector
A Version Vector Version ID might take the form:
Version: "{peerid1: counter1, peerid2: counter2, ...}"
Version Vectors enable direct computation of partial order between
any two event IDs without examining the full version history graph.
(To know the order between two Version Vectors A and B, one needs
only to compare each peer's counter between A and B. If A dominates
across all peers, it is newer. If B dominates, then it is newer.
Otherwise, the ordering between the two version vectors is not known,
and we can say that they happened in parallel.)
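The comparison described above can be sketched as follows. The dict representation of a version vector and the function name are assumptions made for illustration, as is the choice to treat a peer that is missing from a vector as having counter 0.

```python
def compare_version_vectors(a, b):
    """Compare two version vectors given as {peer_id: counter} dicts.

    Returns ">" if a is newer, "<" if b is newer, "=" if they are
    identical, and "<>" if neither dominates (they happened in parallel).
    """
    peers = set(a) | set(b)
    a_ahead = any(a.get(p, 0) > b.get(p, 0) for p in peers)
    b_ahead = any(b.get(p, 0) > a.get(p, 0) for p in peers)
    if a_ahead and b_ahead:
        return "<>"
    if a_ahead:
        return ">"
    if b_ahead:
        return "<"
    return "="


print(compare_version_vectors({"alice": 3, "bob": 2}, {"alice": 2, "bob": 2}))  # >
print(compare_version_vectors({"alice": 3, "bob": 1}, {"alice": 2, "bob": 2}))  # <>
```

The comparison touches only the two vectors, which is why no traversal of the version history graph is needed.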
2.8. Versioning through intermediaries
Intermediaries can take advantage of versioning to uniquely
reference, store, and serve multiple states/updates across a
resource's history. To distinguish versions, intermediaries must add
the "Version" and "Parents" headers to their cache keys -- equivalent
to the Vary header:
Vary: version, parents
Intermediaries SHOULD behave as if the Version and Parents headers
have been added to the Vary header in every response passing through
them. To support legacy versioning-unaware intermediaries, the
origin server is RECOMMENDED to explicitly add or extend Vary with
"version, parents" in all its responses, unless it is certain that no
legacy intermediaries will process the response.
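A versioning-aware intermediary could fold the two headers into its cache key roughly as follows. This is a sketch under simplifying assumptions: the function name is hypothetical, Event IDs are assumed to contain no commas, and a real cache key would include whatever else the cache already varies on.

```python
def cache_key(method, url, request_headers):
    """Build a cache key that distinguishes versions of the same URL."""
    lowered = {name.lower(): value for name, value in request_headers.items()}

    def canonical(value):
        # Sort the IDs so that ordering and spacing differences in the
        # header do not split one version across several cache entries.
        if value is None:
            return ""
        return ", ".join(sorted(part.strip() for part in value.split(",")))

    return (
        method,
        url,
        canonical(lowered.get("version")),
        canonical(lowered.get("parents")),
    )


old = cache_key("GET", "/foo", {"Version": '"1"'})
new = cache_key("GET", "/foo", {"Version": '"2"'})
print(old != new)   # True: the two versions are cached separately
```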
2.8.1. Detecting legacy intermediaries
In the case that a legacy intermediary *does* process a versioned
response without the Vary header, it can be detected by the client
noticing that the Version and Parents Event IDs in a client request
are not present in the request's response. Here is an example:
Presume we start with two versions:
PUT /foo
Version: "1"
Hello
PUT /foo
Version: "2"
Hello World!
Now, if someone GETs the old version:
GET /foo
Version: "1"
Then a versioning-aware origin server can return it:
HTTP/1.1 200 OK
Version: "1"
Hello
However, a versioning-unaware intermediary will cache this old
response as if it is the current state.
This breaks when a client requests the *newest* version:
GET /foo
Version: "2"
And the cache ignores the Version header and returns what it has
most recently seen:
HTTP/1.1 200 OK
Version: "1"
Hello
To detect this, the client SHOULD check that the Version and Parents
headers in the response contain a superset of the Event IDs specified
in the request. (A missing Version or Parents header is considered
to be an empty set for the version or parents of this superset
calculation.) The client can emit a warning to the programmer when
encountering a response that does not contain the request's Event
IDs, so that they know to add Vary to the server's responses.
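The client-side check can be sketched as below. The parser is deliberately simplified (a regular expression that picks out quoted strings and display strings rather than a full Structured Headers parser), and the function names are hypothetical.

```python
import re

# Simplified matcher for Strings ("...") and Display Strings (%"...").
# Event IDs are kept in their serialized form for comparison.
_EVENT_ID = re.compile(r'%?"(?:[^"\\]|\\.)*"')


def event_ids(header_value):
    """Return the set of serialized Event IDs in a header value.

    A missing header (None) is treated as the empty set.
    """
    return set(_EVENT_ID.findall(header_value or ""))


def looks_like_legacy_cache(request_headers, response_headers):
    """True if the response lacks Event IDs that the request specified.

    Both arguments map lower-cased header names to raw header values.
    """
    for name in ("version", "parents"):
        asked = event_ids(request_headers.get(name))
        got = event_ids(response_headers.get(name))
        if not got >= asked:   # the response must contain a superset
            return True
    return False


# The failure from the example above: the client asked for "2" but got "1".
print(looks_like_legacy_cache({"version": '"2"'}, {"version": '"1"'}))   # True
print(looks_like_legacy_cache({"version": '"1"'}, {"version": '"1"'}))   # False
```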
3. Analysis of existing versioning approaches in HTTP
Existing approaches to versioning in HTTP address disparate
use-cases, but have limitations and trade-offs. The Last-Modified
and ETag headers were invented for cache consistency, but do not
provide an ordering of version history through time, nor do they
handle forks and merges in distributed time. On the other hand, a
number of forking/merging versioning systems have been proposed
(WebDAV, Link Relations) that define new resources to represent
versions of existing resources, but these require additional network
round-trips to query and learn about these new version resources. No
HTTP versioning system today allows for articulating custom
distributed timestamp formats such as version vectors.
We now enumerate the existing approaches. The following section
provides an overview of their limitations for a general versioning
system.
3.1. Approaches with versions in headers
The "Last-Modified" and "ETag" headers are by far the most commonly
used versioning mechanisms in HTTP today. They specify a
resource's version directly in request and/or response headers. This
is simple and efficient, as versions are communicated immediately,
without extra round trips, and is very practically useful for caching
and conditional requests. However, the version formats of
Last-Modified and ETag do not provide enough information to compute
an accurate order of versions, and thus cannot be used to represent
distributed time.
3.1.1. Last-Modified
The Last-Modified header specifies a wallclock timestamp that caches
and clients can use to know when a change has occurred:
Last-Modified: Sat, 06 Jul 2024 07:28:00 GMT
This header is useful for caching and conditional requests (using the
If-Modified-Since header). However, it has several limitations:
1. It is limited to one-second precision. If a resource changes more
than once within the same second, the Last-Modified date won't
change, and caches can become inconsistent.
2. It stores linear time, not distributed (partial order) time. It
cannot represent the ambiguity of forked time.
3. It is susceptible to clock skew in distributed systems,
potentially leading to inconsistencies across different servers.
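The first limitation is easy to reproduce. The sketch below formats two hypothetical write times that fall within the same second, using Python's standard library to produce the HTTP date format; the timestamps themselves are invented for illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Two writes to the same resource, 700 milliseconds apart.
first_write = datetime(2024, 7, 6, 7, 28, 0, 100_000, tzinfo=timezone.utc)
second_write = first_write + timedelta(milliseconds=700)

# The date format has one-second resolution, so the sub-second part is lost.
first_header = format_datetime(first_write, usegmt=True)
second_header = format_datetime(second_write, usegmt=True)

print(first_header)                    # Sat, 06 Jul 2024 07:28:00 GMT
print(first_header == second_header)   # True: the change is invisible
```

A cache that validated its copy between the two writes would see an unchanged Last-Modified value afterwards and keep serving the stale body.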
3.1.2. ETag
The ETag header allows more precision. It specifies a version with a
string that uniquely identifies a cacheable representation:
ETag: "2u34fa7yorz0"
ETags can be strong or weak, with weak ETags prefixed by W/:
ETag: W/"2u34fa7yorz0"
ETags are used in conditional requests with If-None-Match and
If-Match headers and can be used for optimistic concurrency
control. However:
1. While helping with cache validation, ETags are not accurate
markers of time. There is no way to order versions by ETag, or
know which version came before another.
2. ETags are unique to content, not timestamps. It's possible for
the same ETag to recur over time if the resource changes back and
forth between a common state.
3. Strong ETags are sensitive to Content-Encoding. If a single
version of a resource is transmitted with different
Content-Encodings (e.g., gzip), it will be sent with different
strong ETags.
Thus, one can have multiple ETags for the same version in history, as
well as a single ETag for multiple versions of history.
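Both effects can be seen in a short sketch. ETag generation is server-specific; here we assume a hypothetical server that derives a strong ETag from a hash of the bytes it sends, which is one common strategy but not the only one.

```python
import gzip
import hashlib


def content_etag(body: bytes) -> str:
    """Hypothetical strong ETag derived from a hash of the bytes sent."""
    return '"' + hashlib.sha256(body).hexdigest()[:12] + '"'


# The state changes and then reverts: three versions in history.
history = [b"Hello", b"Hello World!", b"Hello"]
etags = [content_etag(body) for body in history]
print(etags[0] == etags[2])   # True: one ETag for two versions in history

# The same version sent with Content-Encoding: gzip gets a different ETag.
# (mtime=0 only keeps the compressed bytes reproducible.)
compressed = gzip.compress(history[0], mtime=0)
print(content_etag(compressed) == etags[0])   # False: two ETags, one version
```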
3.2. Approaches defining versions as new resources
Rather than placing versioning information into headers, a second set
of approaches creates new "version resources" for each version of an
original resource, and then adds ways to navigate the "version
resources."
A variety of approaches exist -- from hand-copied "foo-v2.html" URIs,
to standards like WebDAV. However, they all require additional
round-trips, with new URIs or new methods, to discover and navigate
version information, which reduces performance, adds complexity, and
makes it challenging for intermediaries to use versioning information
without performing additional network requests themselves.
3.2.1. Creating additional "Version Resources"
Application programmers often define copies of a resource, at various
versions, by encoding a version identifier into its URI:
https://unpkg.com/[email protected]/index.js
This approach is common, but has several limitations:
1. It loses the semantics of a "resource changing over time."
Instead, it creates multiple version resources for each resource.
2. It then necessitates standards to, given a URI, extract the
current version of the resource, get the previous and next
version(s), understand the format of the version(s) (e.g.,
major.minor.patch), and to create new versions. These standards
are discussed next.
3. It can lead to URI proliferation, complicating a web application's
URI space, and potentially impacting caching strategies and SEO.
3.2.2. Standardizing versions as external resources
A number of approaches enable ways to organize and navigate a set of
"version resources." Memento [RFC7089] provides read-only access to
prior states through datetime negotiation, but defines no mechanism
for creating new versions. WebDAV Versioning [RFC3253] provides a
full lifecycle through CHECKOUT/CHECKIN methods, but requires
server-assigned version identifiers. Link Relations for Version
Navigation [RFC5829] defines link relation types for navigating
between version resources, but leaves the creation and management of
those versions to other protocols. WebDAV and Link Relations also
allow for a partial order of distributed time, whereas Memento is
limited to linear wallclock time.
However, these approaches all require dereferencing additional URIs
(and thus additional network round-trips) to discover version
information, rather than embedding it inline in the headers of a
normal GET or PUT to the resource. This prevents intermediaries from
organizing the version history without doing additional network
requests themselves. Furthermore, although WebDAV supports forking
and merging history, the methods it provides to manipulate the
history (CHECKIN, CHECKOUT, VERSION-CONTROL, LOCK, UNLOCK) require a
centralized server.
3.4. Requirements for general versioning
This section defines the requirements for a general versioning
standard. It then provides a feature table enumerating which
requirements the existing approaches support and do not support.
These requirements are derived from the use-cases described in
Section 4, which together span the range of versioning needs we have
identified in HTTP systems -- from cache consistency and incremental
updates, to distributed collaborative editing and hosting
version-control systems like Git.
3.4.1. Distributed Time
Distributed systems (see examples in Section 4) require support for
distributed time, including:
(a) A comparable partial order:
Any two Versions (timestamps) A and B must be comparable, such
that either:
- A > B: A came after B
- A < B: A came before B
- A <> B: Neither A nor B came before the other
(b) Immune to clock skew or limited precision:
The ordering must not depend on the skew or precision of a
CPU's wallclock.
(c) Merges must be specifiable without central authority:
Any peer must be able to identify the version resulting from
the merger of two parallel versions A and B, without relying on
a central authority to assign a new version identifier AB.
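The three requirements above can be illustrated with vector clocks, one of several timestamp formats that provide a partial order. The sketch below is an assumption-laden example rather than the format this document specifies: the dict representation, the peer names, and the functions compare and merge are hypothetical. It also reports "=" for identical clocks, a case the list above does not enumerate.

```python
from typing import Dict

# Hypothetical representation: peer id -> number of events seen from that peer.
VectorClock = Dict[str, int]

def compare(a: VectorClock, b: VectorClock) -> str:
    """Return '>', '<', '<>' (concurrent), or '=' for two vector clocks.

    Only logical counters are compared, so the result does not depend on
    any peer's wallclock (requirement b).
    """
    peers = set(a) | set(b)
    a_ahead = any(a.get(p, 0) > b.get(p, 0) for p in peers)
    b_ahead = any(b.get(p, 0) > a.get(p, 0) for p in peers)
    if a_ahead and b_ahead:
        return "<>"  # neither came before the other
    if a_ahead:
        return ">"
    if b_ahead:
        return "<"
    return "="

def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Identify the merged version as the pointwise maximum of a and b.

    Any peer computes the same result locally, so no central authority
    has to assign an identifier for the merge (requirement c).
    """
    return {p: max(a.get(p, 0), b.get(p, 0)) for p in set(a) | set(b)}

A = {"alice": 2, "bob": 1}
B = {"alice": 1, "bob": 2}
AB = merge(A, B)
```

Here compare(A, B) yields "<>", while compare(AB, A) and compare(AB, B) both yield ">", so the merged version comes after both parallel versions.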
3.4.2. Included within existing HTTP requests and responses
The versions should be included within existing HTTP requests and
responses, wherever possible, without requiring additional requests
and round-trips. This simplifies implementation, improves
performance, and allows intermediaries to understand and accurately
cache the version history of state as it flows over the wire, without
having to make additional requests themselves. It also allows
applications to add versioning information to their existing
resources without impacting the definition of their existing URI
space -- an unnecessary burden for many applications.
We enumerate these requirements as follows:
(a) Includes versioning in every existing request and response
where state is queried or mutated.
(b) Does not require additional round-trips.
(c) Does not require polluting the application's URL namespace.
3.4.3. Generalizable Timestamps
Some systems have requirements for Version timestamps themselves:
(a) Extensible timestamp formats:
Advanced distributed systems often devise special formats for
partially-ordered timestamps that allow inferences for
improved performance, such as Lamport clocks, vector clocks,
version vectors, hash histories, hybrid logical clocks, and
append-only-log indices. Implementations can rely on
information embedded in these timestamps to compress history
metadata, optimize partial-order computations, or infer the
value of state.
(b) Independent of resource:
Some applications need to assign a single version to multiple
resources, so that one can refer to multiple resources at the
same point in time. For instance, a Git repository (see
Section 4.2) commits multiple files at the same version.
A general versioning system must provide version identifiers
that are independent of any particular resource, so that
multiple resources can be versioned and referred to at the
same points in time.
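Requirement (b) can be sketched as a store in which one version identifier names a snapshot of several resources at once, loosely in the spirit of a Git commit. The identifier scheme used here (a truncated SHA-256 digest of the canonicalized snapshot), the paths, and the names snapshot_version, commit, and read are all assumptions made for this example, not part of the specification.

```python
import hashlib
import json
from typing import Dict

# version id -> {resource path -> content}; one version spans many resources.
history: Dict[str, Dict[str, str]] = {}

def snapshot_version(resources: Dict[str, str]) -> str:
    """Derive a version id from the content of every resource in the snapshot.

    The id names a point in time for the whole set, not for one resource.
    """
    canonical = json.dumps(resources, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:12]

def commit(resources: Dict[str, str]) -> str:
    """Record a snapshot of multiple resources under a single version id."""
    version = snapshot_version(resources)
    history[version] = dict(resources)
    return version

def read(path: str, version: str) -> str:
    """Read any resource as it was at the given version."""
    return history[version][path]

v1 = commit({"/README": "hello", "/main.py": "print('a')"})
v2 = commit({"/README": "hello", "/main.py": "print('b')"})
```

After these two commits, read("/README", v1) and read("/main.py", v1) refer to the same point in time, even though only /main.py differs between v1 and v2.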
3.5. Feature table of existing approaches
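The following table summarizes which of the requirements above each
existing approach satisfies. Columns are labeled by requirement: 1a
refers to Section 3.4.1 (a), 2a to Section 3.4.2 (a), and so on. An
"X" marks a requirement that is satisfied, "~" one that is partially
satisfied, and "-" one that is not: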
                   | 1a | 1b | 1c | 2a | 2b | 2c | 3a | 3b |
------------------------------------------------------------
Last-Modified      | -  | -  | -  | X  | X  | X  | -  | X  |
ETags              | -  | -  | -  | X  | X  | X  | -  | -  |
Memento            | -  | ~  | -  | -  | -  | ~  | -  | -  |
WebDAV Versioning  | X  | X  | -  | -  | -  | -  | -  | -  |
Link Relations     | X  | X  | -  | X  | -  | ~  | -  | -  |
------------------------------------------------------------
Because no existing versioning approach satisfies all needs,
programmers today must implement multiple approaches to versioning in
their applications -- each with subtly different logic -- and cannot
implement common infrastructure for distributed versioning,
archiving, and collaborative editing that works across HTTP systems.
This document specifies a general versioning model satisfying all 8
requirements above. We start by specifying how to add versioning to
HTTP requests and responses.
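The claim that no existing approach satisfies all requirements can be checked mechanically against the feature table. The sketch below transcribes the table rows as strings and assumes that "X" means a requirement is fully satisfied, while "~" and "-" do not count; the names TABLE, satisfied, and unmet_by_all are hypothetical.

```python
from typing import List, Set

REQUIREMENTS = ["1a", "1b", "1c", "2a", "2b", "2c", "3a", "3b"]

# One character per requirement, in the column order of the feature table.
TABLE = {
    "Last-Modified":     "---XXX-X",
    "ETags":             "---XXX--",
    "Memento":           "-~---~--",
    "WebDAV Versioning": "XX------",
    "Link Relations":    "XX-X-~--",
}

def satisfied(approach: str) -> Set[str]:
    """Requirements fully satisfied ('X') by one approach."""
    return {req for req, mark in zip(REQUIREMENTS, TABLE[approach]) if mark == "X"}

def unmet_by_all() -> List[str]:
    """Requirements that no listed approach fully satisfies."""
    covered: Set[str] = set()
    for approach in TABLE:
        covered |= satisfied(approach)
    return [req for req in REQUIREMENTS if req not in covered]

complete = [a for a in TABLE if len(satisfied(a)) == len(REQUIREMENTS)]
gaps = unmet_by_all()
```

Under this reading, complete is empty, and gaps contains 1c and 3a: even a combination of all five approaches leaves merges without a central authority and extensible timestamp formats unaddressed.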
4. Informative Use-Cases
A general mechanism for versioning HTTP resources could enable a
number of new use-cases:
- RSS clients could request incremental updates when polling,
instead of re-downloading redundant unchanged feed items after
each change to any item
- Servers could accept incoming patches based on old or parallel
versions of history, and even rebase those patches for other
clients, at other points in history
- Collaborative editing could be built directly into HTTP
resources, providing the abilities of Google Docs at any URL
- Git repositories could be hosted directly over HTTP, rather than
embedding versioning information within opaque blobs that use
HTTP just as a transport
- Caches and archives could hold and serve multiple versions of a
resource, enabling audits and distributed backups
- Distributed databases could standardize their network APIs on HTTP,
while retaining distributed consistency guarantees