We had been losing cookies in Varnish. Because of how Varnish works, the way to work around cookies is to store their values in headers between stages of request processing. We were finally able to get some data out of varnishlog that looked like this:
- ReqHeader DN-Varnish-Offer-Group:
- LostHeader DN-Varnish-Offer-Sort: price
- Error out of workspace (req)
- LostHeader DN-Varnish-Sbtab-Sorts:
- Error out of workspace (req)
- LostHeader DN-Varnish-Use-View:
- Error out of workspace (req)
- LostHeader DN-Varnish-Ux-Variant: classic_site
- Error out of workspace (req)
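To see which headers are being dropped and how often, output like the above can be tallied with a small shell function. This is only a sketch: `count_lost_headers` is a hypothetical helper name, and it assumes the record tag is the second whitespace-separated field on each line, as in the excerpt above.

```shell
# Summarise LostHeader records from varnishlog output read on stdin.
# Prints "<count> <header-name>" for each lost header, sorted by header name.
# Assumption: the tag (e.g. LostHeader) is the second field on each line.
count_lost_headers() {
  awk '$2 == "LostHeader" {
         name = $3
         sub(/:$/, "", name)   # drop the trailing colon from the header name
         seen[name]++
       }
       END { for (h in seen) print seen[h], h }' | sort -k2
}
```

Something like `varnishlog -d -i LostHeader | count_lost_headers` should work against the log currently in shared memory, though the exact varnishlog options worth using will depend on your setup.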
When you first read these errors and search for them, you will likely find the settings workspace_client and workspace_backend. Those seem like very logical settings to tweak. However, no matter how large we set them, nothing helped. We graph stats coming out of Varnish using the Prometheus exporter, and there we found the metric varnish_main_ws_backend_overflow. That made us believe even more that we were hitting a workspace_backend limit.

It turns out there is more to the workspace settings than just these two. I read through an old issue on GitHub and found some folks trying other settings related to header size and header count limits. In the end, that was our issue. We increased these settings and our overflows disappeared.
http_req_hdr_len = 64k (default is 8k)
http_req_size = 128k (default is 32k)
http_max_hdr = 256 (default is 64)
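As a rough sketch of how these values can be applied, the commands below set them on a running instance with `varnishadm param.set` and then read one back. This assumes `varnishadm` can reach your instance with its default connection settings. Changes made this way do not survive a restart, so the same values also need to go into the `varnishd` startup options as `-p` arguments, wherever your installation defines them.

```shell
# Apply the larger header limits to the running Varnish instance.
varnishadm param.set http_req_hdr_len 64k
varnishadm param.set http_req_size 128k
varnishadm param.set http_max_hdr 256

# Read a value back to confirm it took effect.
varnishadm param.show http_req_hdr_len

# To persist across restarts, add the equivalent options to the varnishd
# command line (the file that holds it varies by distribution):
#   -p http_req_hdr_len=64k -p http_req_size=128k -p http_max_hdr=256
```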
Hopefully this will help someone else who runs into the same problem.