If you don’t secure your data, it’s not unauthorized access
A recent court opinion out of the Eastern District of Pennsylvania in Healthcare Advocates Inc. v. Harding Earley Follmer & Frailey, No. 05-3524, is worthy of note. Law.com reports:
A law firm did not violate copyright and computer anti-hacking laws when it used a Web archive search tool to recover old Web pages of its client’s adversary, says a federal judge.
Although the archived pages were supposed to be shielded from public view, the protections failed and lawyers at Harding Earley Follmer & Frailey in Valley Forge, Pa., did not hack their way in, Eastern District of Pennsylvania Judge Robert Kelly Jr. ruled last week on summary judgment.
“They did not ‘pick the lock’ and avoid or bypass the protective measure, because there was no lock to pick,” Kelly wrote in Healthcare Advocates Inc. v. Harding Earley Follmer & Frailey, No. 05-3524. “Nor did the Harding firm steal passwords to get around a protective barrier. … The Harding firm could not ‘avoid’ or ‘bypass’ a digital wall that was not there.”
[...]
Internet Archive was created as a registry of past Web pages that would be available to the public once they were no longer considered useful, said Kelly. Since the archived pages in question were once available for the world to view on Healthcare Advocates’ site, the “Harding firm had no reason to anticipate that using a public website to view images of another public website would subject them to a civil lawsuit containing allegations of hacking,” he found.
Internet Archive tells Web page creators and users that if they want to have their archived pages blocked from public view, they should insert a “robot.txt” file that is supposed to be recognized by archive software.
Had Harding Earley lawyers knowingly evaded or broken through robot.txt files to gain access to the documents, allegations of hacking may have carried more weight. But here, the robot.txt files were not detected.
[...]
Read more at Law.com.
So if you make a mistake and something gets archived or cached, you cannot complain about “unauthorized access” or make any claim about right to view material, etc. We’ve seen this before with Google caching files that were never intended to be public, and it still troubles me that there is a seeming acceptance of “If you don’t secure, they can archive it.” I still think that no-archive should be the default state, and that if there’s nothing in a robots.txt file that gives a search engine the right to archive, it should not be allowed to.
Yes, I know that’s unrealistic because even if we had a law like that here, it could be archived by a non-U.S. search engine and would be available that way, but there’s something that seems wrong about saying, “You have to become an expert in security or we can archive whatever you unintentionally leave open to view, even if it means we have to go following links to find it.”
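To make the robots.txt point concrete, here is a minimal sketch of how a well-behaved crawler decides whether it may fetch a page. It uses Python's standard `urllib.robotparser`. The `may_fetch` helper, the example URL, and the bot name `SomeOtherBot` are made up for illustration, and the `ia_archiver` user-agent is my assumption about the name an archive crawler would look for. Note what happens when there is no rule at all: the answer is "allowed," which is exactly the default I am complaining about.

```python
from urllib.robotparser import RobotFileParser


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url.

    Hypothetical helper for illustration; it parses the text directly
    instead of downloading robots.txt from a live site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A robots.txt that shuts out one crawler entirely.
# "ia_archiver" is an assumed user-agent name for an archive crawler.
BLOCK_ARCHIVER = """\
User-agent: ia_archiver
Disallow: /
"""

url = "https://example.com/old-page.html"  # placeholder URL

print(may_fetch(BLOCK_ARCHIVER, "ia_archiver", url))   # False: explicitly blocked
print(may_fetch(BLOCK_ARCHIVER, "SomeOtherBot", url))  # True: no rule names this bot
print(may_fetch("", "ia_archiver", url))               # True: empty file, nothing forbidden
```

Keep in mind that robots.txt is only a request. Nothing in the protocol stops a crawler that chooses to ignore it, so it is not a substitute for actually securing a file.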
The world wide web is, by default, a public access area. Are you an expert locksmith? (Probably not.) Yet you can still tell whether the door to your domicile is locked, and if you leave it standing wide open and someone comes in, I doubt it is considered “breaking and entering.” In other words, if you want to make sure something is secure, test it. Use an off-site computer and see if you can get to the data. Or don’t put the information on the world wide web in the first place.
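The "test it" advice can be roughed out in a few lines of Python. This is only a sketch under assumptions: the function names `classify_status` and `probe` and the labels they return are invented for illustration, and the script has to be run from a machine outside your own network, with no saved login, for the result to mean anything.

```python
import urllib.error
import urllib.request


def classify_status(status: int) -> str:
    """Map an HTTP status code to a rough exposure label (labels are illustrative)."""
    if 200 <= status < 300:
        return "exposed"      # an anonymous visitor got the content
    if status in (401, 403):
        return "protected"    # the server asked for credentials or refused
    if status in (404, 410):
        return "absent"       # nothing is served at this address
    return "unclear"          # server errors and anything else need a manual look


def probe(url: str, timeout: float = 10.0) -> str:
    """Request url with no credentials and report how the server responded.

    Caveat: redirects are followed, so a redirect to a login page
    that itself returns 200 will be reported as "exposed".
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        return classify_status(err.code)
    except urllib.error.URLError:
        return "unreachable"  # DNS failure, refused connection, timeout
```

Calling `probe()` on the address of each file you believe is private should come back "protected" or "absent." Anything reported as "exposed" can be read by anyone, crawlers included.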
You’re right, but why should individuals suffer because those who are custodians of their details do not adequately test and secure?
As to “breaking and entering”: according to the penal laws of my state, even if I leave the door open, it is still my dwelling and anyone who entered it would be committing “trespass.” If they steal anything while trespassing, it would be “burglary.” If they came into my house because they’re homeless and cold and wanted to get warm and didn’t steal anything, it would still be trespass.
So… how do we apply that to the internet? If you come into my server because I left a “door open,” are you trespassing? Yes, I can see someone arguing, “Well, how do I know that you didn’t mean to invite me in because the file is on a public server?” but really, sometimes it has to be obvious that some files were never intended for public view, no?
And if after you enter my “dwelling,” you copy my data or take it, how is that not “burglary?”
Over on Emergent Chaos, Mordaxus picked up on my post and there’s some discussion of it over there. I readily admit that I am unsophisticated on this issue and am only concerned that the ultimate victims are not being adequately protected by the laws.
When you put something on a public website, it is not analogous to leaving the back door unlocked in your home. It is more like taking your personal belongings into the street and then expecting no one to mess with them.
An even better analogy would be a business that opened a storefront office and placed its business-sensitive data (e.g., a company ledger) on a table on the sidewalk for all to see. The business owner can’t complain about people reading the ledger if he publishes it.
As the previous commenter noted, don’t open up your website to the public and then complain about privacy violations. No one is directly connected to the Internet in this fashion without making an effort to be so connected. With that connectivity to a public forum comes the responsibility to protect oneself from unwanted exposure.
I agree that anyone who walks by a storefront on the public street may see the ledger and read it and the owner of the ledger has no right to gripe about a privacy violation. But to pursue that analogy, would you say that a newspaper has a right to make a copy of that ledger and then publish it on their site — or wouldn’t you accept the latter as an analogy for what Google cache does?
With all respect to you and others, I think that most security professionals assume too much. For example, you say, “No one is directly connected to the Internet in this fashion without making an effort to be so connected.” I’ll bet you that many users have no idea some files are connected to the Internet because they naively believe that only files they explicitly intend to make available are connected.
And of course, it is the individuals whose personal details are revealed and then misused who are the real victims, as far as I’m concerned. We can point fingers at those who fail to adequately secure files, but the problems wouldn’t be as severe if Google and search engines weren’t indexing and caching without explicit permission. Yes, I know I’m in the minority on this.
Thanks for sharing your thoughts. I learn from everyone’s comments.