Reverse engineering a $1B Legal AI tool exposed 100k+ confidential files
Recorded: Dec. 4, 2025, 3:05 a.m.
| Original | Summarized |
How I Reverse Engineered a Billion-Dollar Legal AI Tool and Found 100k+ Confidential Files, by Alex Schapiro (Dec 2, 2025). Update: This post received a large amount of attention on Hacker News; see the discussion thread. AI legal-tech companies are exploding in value, and Filevine, now valued at over a billion dollars, is one of the fastest-growing platforms in the space. Law firms feed tools like this enormous amounts of highly confidential information. I wanted to see what was actually loading, so I opened Chrome’s developer tools, but saw no Fetch/XHR requests (the requests you often expect to see if a page is loading data). Then I decided to dig through some of the JavaScript files to see if I could figure out what was supposed to be happening. I saw a snippet in a JS file that made a POST request: `await fetch(${BOX_SERVICE}/recommend)`. This piqued my interest: recommend what? And what is the BOX_SERVICE? That variable was not defined in the JS file the fetch would be called from, but (after looking through minified code, which SUCKS to do) I found it in another one: “dxxxxxx9.execute-api.us-west-2.amazonaws.com/prod”. Now I had a new endpoint to test; I just had to figure out the correct payload structure for it. After looking at more minified JS to determine the correct structure for this endpoint, I was able to construct a working payload to /prod/recommend: `{"projectName":"Very sensitive Project"}` (the name could be anything, of course). No authorization tokens were needed, and I was greeted with a response that included boxFolders and a boxToken. At first I didn’t entirely understand the impact of what I saw. No matter the name of the project I passed in, I was recommended the same boxFolders and couldn’t seem to access any files. Then, not realizing I had stumbled upon something massive, I turned my attention to the boxToken in the response. I immediately stopped testing and responsibly disclosed this to Filevine.
They responded quickly and professionally and remediated this issue. |
How I Reverse Engineered a Billion-Dollar Legal AI Tool and Found 100k+ Confidential Files – Alex Schapiro **Summary** On December 2, 2025, Alex Schapiro published a technical blog post detailing a security vulnerability he discovered in Filevine, a rapidly growing AI-powered legal technology platform valued at over a billion dollars. The core of Schapiro’s findings is a lack of security controls in Filevine’s demo environment, reached through a subdomain called “margolis.filevine.com.” Using subdomain enumeration and analysis of JavaScript files, Schapiro found an unauthenticated recommendation endpoint behind a variable named “BOX_SERVICE.” That endpoint returned a fully scoped, maximum-access administrator token for a Box file system containing an estimated 100,000+ confidential files. The vulnerability stemmed from the demo environment’s configuration, which exposed the unrestricted admin token to any caller. Schapiro’s process began with a simple observation: the demo tool required affiliation with a law firm to operate, a common practice. Suspecting that a demo environment might nonetheless be open, he employed subdomain enumeration. The discovery of “margolis.filevine.com” prompted a deeper investigation of the JavaScript files served by the subdomain. He identified a POST request to “dxxxxxx9.execute-api.us-west-2.amazonaws.com/prod/recommend,” which he then probed. The absence of authentication tokens, coupled with the discovery of the “BOX_SERVICE” variable, led him to suspect a potentially sensitive operation. By analyzing further minified JavaScript files, he determined the intended payload for this endpoint: {"projectName":"Very sensitive Project"}. Critically, he noted that no authentication tokens were required.
The response revealed a Box token (the boxToken field) that provided full administrative access to the Box file system. Further investigation, informed by his understanding of the Box API, revealed that this token granted near-unlimited access to confidential files, including internal memos, payrolls, and documents protected by court orders. A search for the term “confidential” returned nearly 100,000 results, highlighting the vast scope of accessible data. Schapiro immediately ceased his testing and responsibly disclosed the vulnerability to Filevine. The company responded with a professional and swift remediation effort. The incident serves as a cautionary tale for companies racing to adopt AI technology. The report underscores the critical importance of robust security controls, proper authentication mechanisms, and thorough testing, particularly within demo environments, to prevent unauthorized access to sensitive client data. Failure to prioritize security when integrating powerful AI tools can create significant legal and reputational risks. Schapiro’s actions exemplify responsible disclosure and highlight the potential consequences of inadequate security practices within the rapidly developing environment of AI-powered legal tech. |
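
The two controls the write-up found missing on the `/recommend` endpoint (an authentication check, and a response that does not carry a storage credential) can be illustrated with a short sketch. This is a hypothetical JavaScript handler, not Filevine's code: the function name `handleRecommend`, the in-memory `VALID_SESSIONS` set, and the request/response shapes are all assumptions made for illustration. Only the `projectName` and `boxFolders` field names come from the post.

```javascript
// Hypothetical sketch; none of these names come from Filevine's code.
// Stand-in for a real session store or token verifier (assumption).
const VALID_SESSIONS = new Set(["session-abc"]);

function handleRecommend(request) {
  // Read a bearer token from the Authorization header, if present.
  const header = (request.headers && request.headers.authorization) || "";
  const token = header.startsWith("Bearer ") ? header.slice(7) : null;

  // Reject anonymous or unknown callers before doing any work.
  if (!token || !VALID_SESSIONS.has(token)) {
    return { status: 401, body: { error: "unauthorized" } };
  }

  // Validate the payload shape described in the post.
  const projectName = request.body && request.body.projectName;
  if (typeof projectName !== "string" || projectName.length === 0) {
    return { status: 400, body: { error: "projectName is required" } };
  }

  // Return only what the client needs. The storage credential stays on
  // the server, so the response has no boxToken-style field at all.
  return { status: 200, body: { projectName, boxFolders: [] } };
}
```

In this sketch a request without an `Authorization` header is refused with a 401 before the payload is even parsed, and a successful response lists folders without exposing any token. A real service would verify sessions against its identity provider and, if the client must talk to the storage backend directly, issue a short-lived token scoped to the specific folders involved.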