
Answer by Sergey Ponomarev for How ETags are generated and configured?

Here is an overview of typical algorithms used in web servers. Suppose we have a file with:

  • Size 1047 bytes, i.e. 417 in hex.
  • MTime, i.e. last modification time, of Mon, 06 Jan 2020 12:54:56 GMT, which is 1578315296 seconds in Unix time or 1578315296666771000 nanoseconds.
  • Inode, which is the physical file number, 66, i.e. 42 in hex.

Different web servers return ETags like:

  • Nginx: "5e132e20-417" i.e. "hex(MTime)-hex(Size)". Not configurable.
  • BusyBox httpd: the same as Nginx.
  • monkey httpd: the same as Nginx.
  • Apache/2.2: "42-417-59b782a99f493" i.e. "hex(INode)-hex(Size)-hex(MTime in microseconds)". Can be configured, but MTime will be in microseconds in any case.
  • Apache/2.4: "417-59b782a99f493" i.e. "hex(Size)-hex(MTime in microseconds)", i.e. without INode, which is friendlier for load balancing, where identical files have different inodes on different servers.
  • OpenWrt uhttpd: "42-417-5e132e20" i.e. "hex(INode)-hex(Size)-hex(MTime)". Not configurable.
  • Tomcat 9: W/"1047-1578315296666" i.e. W/"Size-MTime in milliseconds". This ETag is incorrect: for a static file it should be strong, i.e. guarantee octet-for-octet equality.
  • lighttpd: "hashcode(42-1047-1578315296666771000)" i.e. INode-Size-MTime, but then reduced to a simple integer by a hash function (dekhash). Can be configured, but you can only disable a part (etag.use-inode = "disabled").
  • MS IIS: it has the form Filetimestamp:ChangeNumber, e.g. "53dbd5819f62d61:0". Not documented and not configurable, but can be disabled.
  • Jetty: based on last modification time and size, then hashed. See Resource.getWeakETag().
  • Kitura (Swift): "W/hex(Size)-hex(MTime)". See StaticFileServer.calculateETag.
  • JS lib jshttp/etag: "hex(Size)-hex(MTime)". See stattag.
  • H2O (C): "hex(MTime)-hex(Size)". See h2o_filecache_get_etag.
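
As a rough illustration, the documented formats in the list can be reproduced from the sample file's values. This is only a sketch: the function names are made up for this example and are not taken from any of the servers, and each function simply mirrors the format string described above.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helpers: each writes one ETag flavour into buf
 * and returns the result of snprintf(). */

/* Nginx, BusyBox httpd, monkey, H2O: "hex(MTime in seconds)-hex(Size)" */
int etag_nginx(char *buf, size_t cap, uint64_t mtime_sec, uint64_t size) {
    return snprintf(buf, cap, "\"%" PRIx64 "-%" PRIx64 "\"", mtime_sec, size);
}

/* Apache 2.4: "hex(Size)-hex(MTime in microseconds)" */
int etag_apache24(char *buf, size_t cap, uint64_t size, uint64_t mtime_usec) {
    return snprintf(buf, cap, "\"%" PRIx64 "-%" PRIx64 "\"", size, mtime_usec);
}

/* Apache 2.2: "hex(INode)-hex(Size)-hex(MTime in microseconds)" */
int etag_apache22(char *buf, size_t cap, uint64_t inode, uint64_t size,
                  uint64_t mtime_usec) {
    return snprintf(buf, cap, "\"%" PRIx64 "-%" PRIx64 "-%" PRIx64 "\"",
                    inode, size, mtime_usec);
}

/* OpenWrt uhttpd: "hex(INode)-hex(Size)-hex(MTime in seconds)" */
int etag_uhttpd(char *buf, size_t cap, uint64_t inode, uint64_t size,
                uint64_t mtime_sec) {
    return snprintf(buf, cap, "\"%" PRIx64 "-%" PRIx64 "-%" PRIx64 "\"",
                    inode, size, mtime_sec);
}

/* Tomcat 9: W/"Size-MTime in milliseconds", both in decimal */
int etag_tomcat9(char *buf, size_t cap, uint64_t size, uint64_t mtime_msec) {
    return snprintf(buf, cap, "W/\"%" PRIu64 "-%" PRIu64 "\"", size, mtime_msec);
}

/* Checks the helpers against the values quoted in the list above. */
void check_sample_file(void) {
    const uint64_t size = 1047, inode = 66;
    const uint64_t mtime_nsec = UINT64_C(1578315296666771000);
    char buf[64];

    etag_nginx(buf, sizeof buf, mtime_nsec / 1000000000, size);
    assert(strcmp(buf, "\"5e132e20-417\"") == 0);

    etag_apache24(buf, sizeof buf, size, mtime_nsec / 1000);
    assert(strcmp(buf, "\"417-59b782a99f493\"") == 0);

    etag_apache22(buf, sizeof buf, inode, size, mtime_nsec / 1000);
    assert(strcmp(buf, "\"42-417-59b782a99f493\"") == 0);

    etag_uhttpd(buf, sizeof buf, inode, size, mtime_nsec / 1000000000);
    assert(strcmp(buf, "\"42-417-5e132e20\"") == 0);

    etag_tomcat9(buf, sizeof buf, size, mtime_nsec / 1000000);
    assert(strcmp(buf, "W/\"1047-1578315296666\"") == 0);
}
```

Calling check_sample_file() from a main() should pass silently if the formats above are described correctly. The hashed variants (lighttpd, Jetty) and IIS are left out because their exact algorithms are not spelled out here.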

A few thoughts:

  • Hex numbers are used here so often because it's cheap to convert a number to a hex string, which is also shorter than the decimal one.
  • Inode, while adding more guarantees, makes load balancing impossible and is very fragile if you simply copy the file during an application redeploy. MTime in nanoseconds is not available on all platforms, and such granularity is not needed.
  • Apache has a bug about this: https://bz.apache.org/bugzilla/show_bug.cgi?id=55573
  • The order MTime-Size or Size-MTime also matters: MTime is more likely to change, so putting it first may make comparing ETag strings faster by a dozen CPU cycles.
  • Even though this is not a full checksum hash, it is definitely not a weak ETag. It is enough to show that we expect octet-for-octet equality for Range requests.
  • Apache and Nginx together handle almost all traffic on the Internet, but most static files are served by Nginx, where the ETag format is not configurable.
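
To make the strong versus weak point concrete: HTTP defines two ways of comparing entity-tags (RFC 7232, section 2.3.2), and Range requests rely on the strong one, which a W/ tag can never satisfy. Below is a minimal sketch of both comparisons. The function names are invented for this example, and the inputs are assumed to be well-formed tags including their quotes.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* A weak entity-tag carries the case-sensitive "W/" prefix. */
static bool etag_is_weak(const char *etag) {
    return strncmp(etag, "W/", 2) == 0;
}

/* Returns the quoted opaque part, skipping a "W/" prefix if present. */
static const char *etag_opaque(const char *etag) {
    return etag_is_weak(etag) ? etag + 2 : etag;
}

/* Strong comparison: neither tag is weak and the strings are identical. */
bool etag_strong_equal(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    return !etag_is_weak(a) && !etag_is_weak(b) && strcmp(a, b) == 0;
}

/* Weak comparison: the opaque parts are identical; weakness is ignored. */
bool etag_weak_equal(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    return strcmp(etag_opaque(a), etag_opaque(b)) == 0;
}
```

With these, the Tomcat-style tag W/"1047-1578315296666" matches itself under etag_weak_equal but never under etag_strong_equal, which is why a weak ETag on a static file gets in the way of Range requests.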

It looks like Nginx uses the most reasonable scheme, so if you are implementing ETags, try to make yours the same. The whole ETag is generated in C with one line:

printf("\"%" PRIx64 "-%" PRIx64 "\"", last_mod, file_size);
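
For completeness, here is one way the two inputs might be obtained on a POSIX system. This is a sketch, not how Nginx itself does it: file_etag is a hypothetical helper, and it assumes stat() is available and that the file's MTime is not before 1970.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: writes "hex(MTime)-hex(Size)" in quotes for `path`.
 * Returns 0 on success, -1 if stat() fails or buf is too small.
 */
int file_etag(const char *path, char *buf, size_t cap) {
    struct stat st;

    assert(path != NULL && buf != NULL);
    if (stat(path, &st) != 0)
        return -1;

    /* The casts assume a non-negative MTime in whole seconds. */
    int n = snprintf(buf, cap, "\"%" PRIx64 "-%" PRIx64 "\"",
                     (uint64_t)st.st_mtime, (uint64_t)st.st_size);

    /* snprintf returns the length it wanted; n >= cap means truncation. */
    return (n > 0 && (size_t)n < cap) ? 0 : -1;
}
```

A caller would pass a buffer of a few dozen bytes and, on success, send the result as the value of the ETag response header.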

My proposal is to take the Nginx scheme and have the W3C make it the recommended ETag algorithm.

