I’ve banned query strings

chrismorgan.info - 433 points - 231 comments - 71631 seconds ago

Comments (52)

jedimastert - 62207 seconds ago
You know, I was actually really curious about this, so I went back to the HTML and URL W3C standards, and surprisingly they don't actually define any format for the query string other than it being percent-encoded. One might conflate query strings with "form-urlencoded"[0] query strings, which is one potential interoperability format, but in general a query string is just any percent-encoded string following a "?" in a URL[1], and just another property of the "URL" HTML object that can be used in the generation of a response. While there is additionally a URLSearchParams object that is the result of parsing the query string with the form-urlencoded parser, this is simply an interoperability layer for JavaScript.
I'm going to be honest, I was pretty geared up to have a contrarian opinion until I looked at the standards, but they're actually pretty clear: a 404 could be a proper response to an unexpected query string. The query string is as much a part of the URL API as the path is, and I think pretty much everyone can acknowledge that just tacking random stuff onto the path would be ill-advised and undefined behavior.
[0]: https://url.spec.whatwg.org/#application/x-www-form-urlencod...
[1]: https://url.spec.whatwg.org/#url-class
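To make the distinction above concrete, here is a minimal sketch using the standard `URL` and `URLSearchParams` objects. The `example.com` URLs are made up for illustration; the point is only that the raw query is an opaque string, and the key/value view is a separate parsing step layered on top.

```javascript
// Hypothetical URLs, used only for illustration.
const opaque = new URL("https://example.com/page?just%20a%20string");

// The raw query: whatever follows "?", still percent-encoded.
const rawQuery = opaque.search; // "?just%20a%20string"

// The form-urlencoded view of the same string: one name with an empty value.
const parsed = [...opaque.searchParams.entries()]; // [["just a string", ""]]

// A query that does follow the form-urlencoded convention.
const form = new URL("https://example.com/page?a=1&b=two+words");
const b = form.searchParams.get("b"); // "two words" ("+" decodes to a space)

console.log(rawQuery, parsed, b);
```

A server is free to look only at the raw string and never run the form-urlencoded parser at all.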
andersmurphy - 2759 seconds ago
Yup, do what works best for you.
I do the opposite: I don't support path params. That means your router can just be a simple map/dict.
Query strings avoid all the hierarchy/taxonomy problems you run into with path params.
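A rough sketch of what a map-based router like the one described above could look like. The routes, handler bodies, and the `dispatch` helper are all hypothetical; this is one way to do it, not necessarily how the commenter does.

```javascript
// Hypothetical routes: exact paths map straight to handlers, no patterns.
const routes = new Map([
  ["/", () => "home"],
  ["/article", (params) => `article ${params.get("id") ?? "(none)"}`],
]);

// Look the path up in the map; variable parts travel in the query string.
function dispatch(rawUrl) {
  const url = new URL(rawUrl, "https://example.com"); // base is an assumption
  const handler = routes.get(url.pathname);
  if (!handler) return { status: 404, body: "not found" };
  return { status: 200, body: handler(url.searchParams) };
}

console.log(dispatch("/article?id=42")); // status 200, body "article 42"
console.log(dispatch("/article/42"));    // status 404: no path params
```

The lookup is a single `Map.get`, which is the simplicity being argued for; the trade-off is that `/article/42` style URLs are simply not routable.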
wodenokoto - 38481 seconds ago
So my understanding is, he is annoyed that other websites add a query string such as "?ref=origin.com" to links pointing to the author's website.
How does this benefit the other website? How does this hurt the author's website?
I am completely confused about the behavior of both sides here.
I get that when I run an ad campaign I want Google to add a UTM query string, so I can track which campaign users arrived from - but then the origin and the destination are working together. Here the origin just adds stuff for no reason. Why?
ChrisMarshallNY - 60085 seconds ago
> It is a small, decentralised, self-hosted web console that lets visitors to your website explore interesting websites and pages recommended by a community of independent personal website owners.
Back in the Stone Age, we called these “Webrings,” but they weren’t as fancy.
One of the issues that I faced, while developing an open-source application framework, was that hosting that used FastCGI, would not honor Auth headers, so I was forced to pass the tokens in the query. It sucked, because that makes copy/paste of the Web address a real problem. It would often contain tokens. I guess maybe this has been fixed?
In the backends that I control, and aren’t required to make available to any and all, I use headers.
1shooner - 64133 seconds ago
>So I’ve decided to try a blanket ban for this site: no unauthorised query strings.
His site returns (I think incorrectly) a 414 if a request includes a query string. If this protest is meant to advocate for the user, who presumably wasn't able to manage that string in the first place, why would you penalize them for it being there?
Why not just use it as a cue to tell users how they can make this decision themselves (e.g. through browser tools)?
Aardwolf - 58500 seconds ago
> You could argue that I’m abusing 414 URI Too Long. I respond that it’s funnier this way. Other options I considered were:
Another option to consider is "418 I'm a teapot": teapots usually also don't support query strings
humodz - 60971 seconds ago
The tone of this and Chris's post gives me the impression that it's harmful to include these query parameters, but I don't understand how. Could someone enlighten me? I understand it can mangle some URLs and that's a good enough reason not to do it, but even then it seems like a minor inconvenience.
dspillett - 48907 seconds ago
Maybe an alternative would be to inconvenience people following such links still, but somewhat less.
Instead of responding with an error, give a page that states “The link you followed to get here appears to have had some tracking gubbins added, in case you are a bot following arbitrary links, and/or using random URL additions to look like a more organic visit, please wait while we run a little PoW automaton deterrent before passing you on to the page you are looking for.” then do a little busy work (perhaps a real PoW thingy) before redirecting. Or maybe don't redirect directly, just output the unadorned URL for the user to click (and pass on to others). This won't stop the extra gubbins being added of course, but neither will the error and this inconveniences potential readers less.
dang - 59619 seconds ago
Since the original source hadn't had a discussion on HN yet, I've put that link (https://chrismorgan.info/no-query-strings) at the top and moved the response link (https://susam.net/no-query-strings.html) to the toptext.
Both are good but it seems fair to give priority to the original.
stickfigure - 26720 seconds ago
This basically boils down to "reject any incoming links from facebook, pinterest, chatgpt, linkedin, twitter, reddit, youtube, etc". I guess sure? There's a once-famous guy who shows goatse to all referring links from HN. I guess if you get enough traffic that you can pick which sources you want to allow, that's a good problem to have.
xp84 - 33093 seconds ago
I love the hilarious output. He even coded in a special case for just a question mark without any params:
https://chrismorgan.info/no-query-strings?
Never have I seen such a sassy web server
hamdingers - 56195 seconds ago
While I don't take the author's hard stance, I do hate gratuitous query params that result in links that are thousands of characters long.
I use this bookmarklet to strip query params before sharing a link:
```
    javascript:(()=>navigator.clipboard.writeText(location.origin+location.pathname))();
```
takethebus - 9319 seconds ago
I was just thinking about something like this. Instagram and TikTok are major offenders, not everyone wants their personal info blasted everywhere because they copied a link. It would be great to have an iOS shortcut that automatically removes them, it's something I'm going to look into.
zzo38computer - 37082 seconds ago
Query strings do have uses (such as for searching files and some other kind of dynamic files), but you shouldn't add them to URLs that should not expect them. So, I agree that they would be right to refuse requests with UTM and other stuff added like that.
I think 404 probably makes the most sense as the response if a query string is not expected but is present anyways, although 400 might also be suitable.
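A small sketch of the "404 if a query string is not expected" idea. The paths, the allowlist, and the `statusFor` helper are invented for illustration, and the choice of 404 over 400 simply follows the suggestion above.

```javascript
// Hypothetical allowlist: which query parameter names each path accepts.
const allowedParams = new Map([
  ["/search", new Set(["q", "page"])],
  ["/about", new Set()], // accepts no query parameters at all
]);

// Return the status a request would get under this (assumed) policy.
function statusFor(rawUrl) {
  const url = new URL(rawUrl, "https://example.com"); // base is an assumption
  const allowed = allowedParams.get(url.pathname);
  if (!allowed) return 404; // unknown path
  for (const name of url.searchParams.keys()) {
    if (!allowed.has(name)) return 404; // unexpected parameter
  }
  return 200;
}

console.log(statusFor("/search?q=cats"));      // 200
console.log(statusFor("/about?utm_source=x")); // 404
```

Note that this sketch only inspects parsed parameter names, so a bare trailing `?` passes; a stricter policy would check `url.search` or the raw request target instead.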
gtowey - 64931 seconds ago
"wander console" sounds like they're just web rings re-invented. In the era of forced feeds by giant corporations which consist of the things they want you to see, I've wondered if this old idea would make a comeback. Human curated content from trusted people seems like the only way forward.
elAhmo - 8204 seconds ago
Such a useless blog post and initiative. The author controls his website, and if someone lands there by clicking a link, it is not the user's fault.
peesem - 57678 seconds ago
edit: not true https://news.ycombinator.com/item?id=48077990
"I don’t like people adding tracking stuff to URLs" and "You abuse your users by adding that to the link" and "no unauthorised query strings" and "At present I don’t use any query strings" but for some reason ?igsh, which i'm pretty sure is an instagram tracking parameter, is allowed. weird
jameshart - 54545 seconds ago
There’s nothing ruder in hypertext etiquette than giving someone a link to navigate to someone else’s HTTP server, where you have manipulated that URL in some way unsanctioned by the server you are sending them to.
You can’t just send arbitrary query string parameters to a server and assume they will just ignore them. Just like you can’t just remove query string parameters and assume the URL will work.
sigseg1v - 64695 seconds ago
Adding query strings is one of those things that I think a lot of sites could get away with more easily if they were reasonable about it.
A link that is "https:// web.site" is fine.
A link that is "https:// web.site?via=another.site" is fine.
A link that is "https:// web.site?fbm=avddjur5rdcbbdehy63edjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63edaaaddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednzzddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63edn"
is annoying as shit and I need to literally apologize to people after sending it if I forget to manually redact the query string. Don't abuse this.
patrickdavey - 32153 seconds ago
"It’s my website: I can do what I want with it. "
Right on! It's so liberating having your own wee corner of the internet.
gpvos - 53738 seconds ago
This is not the first site to do so. A few years back, scarygoround.com started blocking query strings, although it seems to have stopped doing so now. Back then, Facebook had started to add ?fbclid=... to every outgoing link.
madprops - 54394 seconds ago
>Right click a youtube video from the results to copy the URL. I would have liked a short URL ready to share with people in chats, but no, I get: https://www.youtube.com/watch?v=IFfLCuHSZ-U&pp=ygUNcmF0Ym95I...
>Want to share an amazon product on a chat to discuss about it. I would have liked a nice short url that I can copy, instead I get a monstrosity, it forces me to manually select only the id portion of it if I want to share it.
noduerme - 15407 seconds ago
Most of the sites that still use GET queries around here are the tax collection sites run by local governments, which pass those variables around after you log in the way your mom... uh, it's HN, skip the mom joke.
I actually get a lot more annoyed by routing parsers that do the same thing GET requests do, only while pretending to be a real URL.
arjie - 63994 seconds ago
Doesn't a referrer policy of strict-origin-when-cross-origin already give a host-level Referer (sic) header in most mainstream browsers, unless the user has configured otherwise? That's usually enough for web authors to know what audience they're appealing to, and privacy-maximizers can turn off sending that header.
sutterd - 42289 seconds ago
This url worked fine:
https://chrismorgan.info/no-query-strings#:~:text=So%20I%E2%...
but this one was too long:
https://chrismorgan.info/no-query-strings?a=1
moritzwarhier - 151491 seconds ago
This is cool and creative!
It uses 4xx, but not just 400 :)
https://chrismorgan.info/no-query-strings?why=unknown
itopaloglu83 - 54410 seconds ago
YouTube is also quite famous for its source identifiers. Especially with the short URLs, the tracking part is longer than the URL I'm trying to share.
codingclaws - 52939 seconds ago
I was just wondering if I should do something like this. I use a couple query string values and I validate them and issue a 40x if the value is invalid. So, I was wondering if I should issue a 40x for an unused query string val.
gwern - 63313 seconds ago
Query strings break unpredictably, and that alone is enough to ban third parties from adding them, especially for something as minor as referral tracking.
Example: The Browser is a well known link aggregation paid periodical. I subscribe, and every 1 in 10 or 20 links I clicked, it'd just break outright and I'd have to tediously edit the URL to fix it (assuming the website didn't do a silent ninja URL edit and make it impossible for me to remember what URL I opened possibly days or weeks ago in a tab and potentially fix it). This was annoying enough to bother me regularly, but not enough to figure out a workaround.
Why? ...Because TB was injecting a '?referrer=The_Browser' or something, and the receiving website server got confused by an invalid query and errored out. 'Wow, how careless of The Browser! Are they really so incompetent as to not even check their URLs before mailing an issue out to paying subscribers?'
I wondered the same thing, and I eventually complained to them. It turns out, they did check all their URLs carefully before emailing them out... emphasis on 'before', which meant that they were checking the query-string-free versions, which of course worked fine. (This is a good example of a testing failure due to not testing end-to-end or integration testing: they should have been testing draft emails sent to a testing account, to check for all possible issues like MIME mangling, not just query string shenanigans.)
After that they fixed it by making sure they injected the query string before they checked the URLs. (I suggested not injecting it at all, but they said that for business reasons, it was too valuable to show receiving websites exactly how much traffic TB was driving to them on net, because referrers are typically stripped from emails and reshares and just in general - this, BTW, is why the OP suggestion of 'just set a HTTP referrer header!' is naive and limited to very narrow niches where you can be sure that you can, in fact, just set the referrer header.)
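The ordering problem described above (check first, inject afterwards) can be sketched as follows. The `withReferrer` helper and the sample link are hypothetical, and the `referrer` parameter name is taken from the "or something" guess in the comment, so treat this as an illustration rather than what The Browser actually runs.

```javascript
// Hypothetical helper: add the tracking parameter without clobbering an
// existing query string or fragment.
function withReferrer(rawUrl, referrer) {
  const url = new URL(rawUrl);
  url.searchParams.set("referrer", referrer);
  return url.toString();
}

// Made-up draft link for illustration.
const draftLinks = ["https://example.com/a?id=7#top"];

// Build the links exactly as they will be mailed out...
const finalLinks = draftLinks.map((link) => withReferrer(link, "The_Browser"));

// ...and run whatever link checker is in use against finalLinks,
// not draftLinks, so the check sees what subscribers will click.
console.log(finalLinks); // ["https://example.com/a?id=7&referrer=The_Browser#top"]
```

Even a correctly built URL can still be rejected by the receiving server, which is why the check has to run on the final form.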
But this error was affecting them for god knows how long and how many readers and how many clicks, and they didn't know. Because why would they? The most important thing any programmer or web dev should know about users is that "they may never tell you": https://pointersgonewild.com/2019/11/02/they-might-never-tel... (excerpts & more examples: https://gwern.net/ref/chevalier-boisvert-2019 ). No matter how badly broken a feature or service or URL may be, the odds are good that no user will ever tell you that. Laziness, public goods, learned helplessness / low standards, I don't know what it is, but never assume that you are aware of severe breakage (or vice-versa, as a user, never assume the creator is aware of even the most extreme problem or error).
Even the biggest businesses.... I was watching a friend the other day try to set up a bank account in Central America, and clicking on one of the few banks' websites to download the forms on their main web page. None of the form PDF download links worked. "That's not a good sign", they said. No, but also not as surprising as you might think - the bank might have no idea that some server config tweak broke their form links. After all, at least while I was watching, my friend didn't tell them about their problem either!
notlive - 58041 seconds ago
Referrer is sometimes nice to know. If your site gets a traffic spike from an email newsletter that traffic won't correctly identify the source in the http headers.
No qualms with OP, your site your rules.
donohoe - 42741 seconds ago
A neat and funny idea - but in the end it is hostile to the users who don’t always control what’s added to links.

llimllib - 35661 seconds ago

> curl, for example, seems to illegitimately strip a trailing question mark (could be only for the command line, didn’t test library usage).

umm what? I don't know what they're actually sending where they think this, but if you think curl is broken you should re-think that maybe you're the one doing something wrong.

Here are some examples showing curl not stripping question marks (obviously), I am very curious what this person was actually seeing

    $ curl -s 'https://httpbingo.org/get?' | jq .url
    "https://httpbingo.org/get?"
    $ curl -s 'https://httpbingo.org/get?path' | jq .url
    "https://httpbingo.org/get?path"
    $ curl -s 'https://httpbingo.org/get?path,query=bananas' | jq .url
    "https://httpbingo.org/get?path,query=bananas"
    $ curl -s 'https://httpbingo.org/get????' | jq .url               
    "https://httpbingo.org/get????"
    $ curl -sv 'https://httpbingo.org/????' 2>&1 | grep :path
    * [HTTP/2] [1] [:path: /????]

legitster - 59992 seconds ago
Query strings are awesome. Especially for one-page applications.
I build a lot of internal applications, and one of my golden UI rules is that a user should be able to share their URL and other users should be able to see exactly what the sender did.
So if you have a dashboard or visualization where the user can add filters or configurations, I have all of their settings saved automatically in the URL. It's visible, it's obvious, it's easy, it's convenient.
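As a sketch of the "settings live in the URL" pattern described above: the setting names and the `example.com` address are invented, and a real single-page app would typically also call `history.replaceState` to update the address bar, which is left out here so the snippet runs outside a browser.

```javascript
// Serialize the current (hypothetical) dashboard settings into a shareable URL.
function settingsToUrl(baseUrl, settings) {
  const url = new URL(baseUrl);
  for (const [key, value] of Object.entries(settings)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

// Recover the settings when someone opens the shared link.
// All values come back as strings.
function settingsFromUrl(rawUrl) {
  return Object.fromEntries(new URL(rawUrl).searchParams.entries());
}

const shared = settingsToUrl("https://example.com/report", {
  region: "emea",
  range: "30d",
});
console.log(shared);                  // https://example.com/report?region=emea&range=30d
console.log(settingsFromUrl(shared)); // { region: "emea", range: "30d" }
```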
>There is also a moral question here about whether it is okay to modify a given URL on behalf of the user in order to insert a referral query string into it. I think it isn't.
These dogmatic technical screeds are all so weird to me. They usually reveal more about the author's lack of experience or imagination than provide a useful truism.
dredmorbius - 56205 seconds ago
This is genius, kudos Chris.
It also makes me wonder what other noxious online behaviours might be addressed through ... creative ... client-side responses similar to this.
We've already seen, for years, sites attempting to socially-condition people over the use of ad-blockers and Javascript disablers. No reason why the Other Side can't fight back as well.
julianlam - 64738 seconds ago
> After I implemented that feature, a page from one of my favourite websites refused to load in the console... the third URL returns an HTTP 404 error page. The website uses the query string to determine which one of its several font collections to show.
Yes, let's unilaterally decide that query strings are bad because one website (ab)uses query strings to load different fonts.
It's the query strings that are the problem, not the website!
jfc.
Look, I'm against utm fragments as much as the next guy, but let's not throw away a perfectly good thing because tracking is evil.
lofaszvanitt - 36171 seconds ago
IMDB recently went haywire
they added these ugly qses into every click on their site, bonkers: ?ref_=nm_ov_bio_lk
gojomo - 53651 seconds ago
Trying to bootstrap some taboo against novel unpermissioned URL munging is silly prudishness.
Ensuring both sides of a hyperlink agree/consent was a design flaw that limited the uptake of pre-web hypertext systems. The web's laissez-faire approach demonstrated a looser coupling was far better for users, despite all the new failure modes.
Of course any site/server has the practical power to treat inbound requests as rigorously (or harshly) as they want. But by the web's essential nature, it is equally part of the inherent range-of-freedom of outlink authors to craft their URLs (and thus the resulting requests) however they want. URLs are permissionless hyperlanguage, not the intellectual property of entities named therein.
Plenty of sites welcome such extra info, and those that don't want it can ignore it easily enough – including by just not caring enough about the undefined behavior/failures to do anything.
Though, when a web publisher has naively deployed a system that's fragile with respect to unexpected query-string values, they should want to upgrade their thinking for robustness, via either conscious strictness or conscious permissiveness. Thereafter, their work will be ready for the real web, not just some idealized sandbox where scolding unwanted behavior makes sense.
himata4113 - 26360 seconds ago
?referrer=123 still works, so I guess it's selective.
ashley95 - 48708 seconds ago
But ?fbclid is not banned?
arexxbifs - 57029 seconds ago
Running your own small website is a constant battle against grifters and bad online etiquette. When people hotlink images, I usually make a point of having some personal fun with mod_rewrite.
fragmede - 39501 seconds ago
It's just a string though. A project that I'll never get to is a custom webserver so that QR codes can use the smaller characterset, so it can link to a URL with parameters without forcing the larger character set.
lloydatkinson - 54752 seconds ago
This is really cool. My site is hosted by cloudflare, so I guess I could do the same with a cloudflare worker... maybe?
throw310822 - 17642 seconds ago
Whatever floats your boat
shevy-java - 55848 seconds ago
> It’s my website: I can do what I want with it.
> And you can do what you want with yours!
That does not make a lot of sense. Yes, you can do what you want with your website, but query-string is a way for users to query for additional information or wants or needs. I use them on my own websites to have more flexibility. For instance:
```
    foobar.com/ducks?pdf
```
That will download the website content as a formatted .pdf file.
I can give many more examples here. The "query strings are horrible" I can not agree with at all. His websites don't allow for query strings? That's fine. But in no way does this mean query strings are useless. Besides, what does it mean to "ban" it? You simply don't respond to query strings you don't want to handle. We do so via general routing in web-applications these days.
willthefirst - 59259 seconds ago
I mean…the site that broke should know what to do with arbitrary query strings. If your site breaks when someone puts in an invalid query string, that’s on you?