Technical underpinnings of autofill data

When users fill out web forms, browsers often store the entered information locally so that it can be reinserted automatically the next time a similar form is encountered, all in the name of user convenience. Microsoft Edge, like many Chromium-based browsers, stores this form input data in a local SQLite database named “Web Data” on the endpoint[1]. We could write an entire piece on that database alone; however, the key tables we are going to focus on are:

  1. autofill_edge_field_values:
    • Records the raw data entered into form fields, including credentials, financial information, and personal identifiers.
    • Tracks timestamps indicating when these entries were last used.
  2. autofill_edge_field_client_info:
    • Maps each form field to a specific domain and form signature.
    • Provides labels and metadata that add context, helping analysts identify which applications or websites the data originated from.

By correlating these tables, analysts can reconstruct entire form submissions, identify patterns, and precisely determine when and where specific data was collected.
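
As a minimal sketch of how an analyst might get started, the Python below works on a copy of the database rather than the original file, opens that copy read-only, and checks that the two tables named above are present. The function names are ours for illustration only, and the example assumes the file has already been collected from the path in footnote [1].

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path

# Table names are taken from the text above.
EXPECTED_TABLES = {
    "autofill_edge_field_values",
    "autofill_edge_field_client_info",
}


def open_web_data_copy(web_data: Path) -> sqlite3.Connection:
    """Copy the database into a temp directory and open the copy read-only."""
    workdir = Path(tempfile.mkdtemp(prefix="webdata_"))
    copy_path = workdir / "Web Data"
    shutil.copy2(web_data, copy_path)
    # mode=ro prevents the analysis from writing to the working copy.
    return sqlite3.connect(copy_path.as_uri() + "?mode=ro", uri=True)


def missing_tables(conn: sqlite3.Connection) -> set[str]:
    """Return the expected autofill tables that are absent from the database."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return EXPECTED_TABLES - {name for (name,) in rows}
```

An empty set from missing_tables() suggests the file is an Edge “Web Data” database with both autofill tables available for correlation.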

A common misconception is that browser autofill retains only mundane contact details. In our recent investigation, our team uncovered a much broader and more sensitive scope of information, including:

  • Microsoft Forms Responses: Detailed answers to internal compliance surveys, performance reviews, and user satisfaction questionnaires.
Figure 1. Microsoft Form data – form was not submitted but still recorded.
  • HR Requests & Organisational Feedback: Confidential HR data, such as absence requests and managerial feedback forms, inadvertently logged by the browser.
Figure 2. “SurveyMonkey” free text box logged as form data.
  • Payment & Financial Details: Full credit card numbers, billing addresses, and other payment-related information, posing severe financial and regulatory risks.
Figure 3. “Getmybalance.com” recorded a 16-digit debit card number as form data (extracted from SQLite into CSV).
  • Credentials & National Insurance Numbers: Passwords, login pairs for corporate systems, and national insurance numbers, providing attackers with personal details and a foothold into critical infrastructure.
Figure 4. “registertovote.service.gov.uk” recorded national insurance number as form data (extracted from SQLite into CSV).
  • Customer Data from Web-Based Platforms: Sensitive customer details entered by employees into web forms — such as personally identifiable information, contact details, and account numbers — can end up stored locally, turning a simple convenience feature into a serious data protection concern.

We even noticed that data entered into generative AI platforms such as ChatGPT is also recorded and can be correlated to reconstruct the conversation. This is particularly useful for understanding whether company information is being entered into these platforms.

Figure 5. “chatgpt.com” recorded data inputted into canvas fields as well as other conversations.

Why sensitive data ends up in autofill tables

This unexpected breadth of captured data likely originates from website construction and form design. Browsers like Edge rely on HTML attributes (e.g., type="password", autocomplete="off", or autocomplete="cc-number") and heuristic algorithms to determine which fields are sensitive. If a form developer does not label fields properly, fails to use recommended autocomplete attributes, or uses ambiguous field names, the browser may not recognise a field as sensitive. Without these signals, the browser treats the field as a regular input, storing and potentially autofilling data that should never be retained locally.
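
To make the point concrete, here is a rough sketch of how a form's markup could be checked for fields that give the browser no autocomplete hint. It uses only Python's standard html.parser module; the class name and the sample form are hypothetical, and the check says nothing about what Edge's own heuristics would decide. It simply flags inputs that leave the decision to those heuristics.

```python
from html.parser import HTMLParser


class AutocompleteAudit(HTMLParser):
    """Collect input and textarea fields that carry no autocomplete attribute."""

    def __init__(self) -> None:
        super().__init__()
        self.unlabelled: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("input", "textarea"):
            return
        attributes = dict(attrs)
        # Password fields are signalled by their type; buttons hold no user data.
        if attributes.get("type") in ("password", "hidden", "submit", "button"):
            return
        if "autocomplete" not in attributes:
            self.unlabelled.append(attributes.get("name") or attributes.get("id") or "?")


# Hypothetical form: only the card field tells the browser what it contains.
SAMPLE_FORM = """
<form>
  <input type="text" name="card" autocomplete="cc-number">
  <input type="password" name="pw">
  <input type="text" name="ni_number">
  <textarea name="feedback"></textarea>
</form>
"""

audit = AutocompleteAudit()
audit.feed(SAMPLE_FORM)
print(audit.unlabelled)  # ['ni_number', 'feedback']
```

In this sample, the national insurance number and free-text feedback fields are exactly the kind of input that a browser could treat as ordinary form data.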

Over time, this leads to repositories of information — from credit card details to HR-related forms — that the organisation never intended to store in such an exposed format. The result is a dangerous stockpile of sensitive data hiding in plain sight.

The good: Forensic value

From a digital forensics and incident response perspective, these autofill databases offer invaluable insights. Crucially, this data often persists even after a user clears their regular browsing history. Such persistence enables investigators to:

  • Reconstruct Timelines: By aligning date_created and date_last_used timestamps with known events, analysts can trace the activities of a user, attacker, or investigation subject and correlate them with suspicious incidents.
  • Establish Attribution & Incident Scope: By joining the tables, analysts can determine precisely what information was entered, which website and form it was entered into, which forms were targeted, and how data may have been exfiltrated.
  • Analyse User Behaviour: Forms are another data source for tracking a user’s activity and behaviour online, showing not only where they have visited but how they interacted with those sites. This provides insights where other sources might not be available.

In short, these records serve as silent witnesses that preserve a detailed account of user interactions long after other browser records have been purged.
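
The timeline idea can be sketched in a few lines of Python. This example assumes that date_created and date_last_used hold Unix epoch seconds and that rows have already been read as (field_id, value, date_created, date_last_used) tuples; the function names and the row layout are illustrative rather than part of any tool.

```python
from datetime import datetime, timezone


def to_utc(epoch_seconds: int) -> datetime:
    """Convert Unix epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def entries_in_window(rows, start: datetime, end: datetime):
    """Return (last_used, field_id, value) for rows last used within [start, end].

    Results are ordered oldest first so they can be laid alongside other
    timeline sources for the same incident window.
    """
    hits = []
    for field_id, value, _date_created, date_last_used in rows:
        used = to_utc(date_last_used)
        if start <= used <= end:
            hits.append((used, field_id, value))
    return sorted(hits, key=lambda hit: hit[0])
```

Passing the start and end of a known incident window returns only the form entries last used in that period, which can then be compared with other evidence.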

Correlating data with SQL queries

By joining the two primary tables, autofill_edge_field_values and autofill_edge_field_client_info, we can gain a far more detailed view of user interactions. The autofill_edge_field_client_info table associates each entry with a domain, a field signature, and a form signature, which Edge uses to group fields belonging to the same logical form. Meanwhile, autofill_edge_field_values holds the actual data entered into these fields.

  • Form Signatures (form_signature_v1 and form_signature_v2) are unique identifiers the browser generates based on attributes of a webpage’s form, such as field order, naming, and layout. By hashing these characteristics, Edge can recognise when a user encounters a similar or identical form again, ensuring previously entered data can be reused in the right place. Essentially, form signatures allow the browser to identify and map groups of fields back to specific forms across visits.
  • Field Signatures (Field IDs) are internal identifiers (field_id) that the browser assigns to each field, tying user-entered values to the precise input element they belong to. Linking field_id values from autofill_edge_field_values with their corresponding entries in autofill_edge_field_client_info helps analysts reconstruct the structure of the form and understand exactly which data was entered into which field.
Figure 6. Autofill data grouped in DB Browser for SQLite.
Figure 7. Grouped autofill data in JSON format.

The SQL query below merges these elements, not only providing the date when the data was last used but also consolidating labels and values into both a human-readable string and a JSON object. This structured output makes it easier for analysts to interpret the data and see how individual form fields relate to one another, the domain they came from, and when they were last relevant:
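
A minimal sketch of this kind of join is shown here, wrapped in Python so that it can be run against a small in-memory database. It is not the exact query used in our investigation. The table names, field_id, date_last_used, and form_signature_v1 come from the text above; the join column field_signature_v1 and the columns domain_value and label are assumed names, as is the toy schema, so check the real schema (for example with PRAGMA table_info) before relying on it.

```python
import json
import sqlite3

# ASSUMPTION: field_signature_v1, domain_value and label are assumed column
# names; verify them against the real "Web Data" schema before use.
QUERY = """
SELECT
    datetime(v.date_last_used, 'unixepoch') AS last_used_utc,
    c.domain_value                          AS domain,
    c.form_signature_v1                     AS form_signature,
    c.label                                 AS label,
    v.value                                 AS value
FROM autofill_edge_field_values AS v
JOIN autofill_edge_field_client_info AS c
    ON c.field_signature_v1 = v.field_id
ORDER BY v.date_last_used, c.label
"""


def grouped_submissions(conn: sqlite3.Connection) -> list[dict]:
    """Group joined rows by last-used time, domain and form signature."""
    groups: dict[tuple, dict] = {}
    for last_used, domain, signature, label, value in conn.execute(QUERY):
        groups.setdefault((last_used, domain, signature), {})[label] = value
    return [
        {
            "last_used_utc": last_used,
            "domain": domain,
            "form_signature": signature,
            # Human-readable string and JSON object, as described in the text.
            "summary": "; ".join(f"{k}: {v}" for k, v in fields.items()),
            "fields_json": json.dumps(fields),
        }
        for (last_used, domain, signature), fields in groups.items()
    ]


def demo_connection() -> sqlite3.Connection:
    """Build a toy database with the assumed schema and two sample fields."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE autofill_edge_field_values (
            field_id INTEGER, value TEXT,
            date_created INTEGER, date_last_used INTEGER);
        CREATE TABLE autofill_edge_field_client_info (
            field_signature_v1 INTEGER, form_signature_v1 INTEGER,
            domain_value TEXT, label TEXT);
        INSERT INTO autofill_edge_field_values VALUES
            (1, 'Jane', 1700000000, 1700000000),
            (2, 'Doe',  1700000000, 1700000000);
        INSERT INTO autofill_edge_field_client_info VALUES
            (1, 99, 'example.com', 'First name'),
            (2, 99, 'example.com', 'Last name');
    """)
    return conn


for submission in grouped_submissions(demo_connection()):
    print(submission["last_used_utc"], submission["domain"], submission["summary"])
```

The grouping is done in Python rather than with SQLite's JSON functions so that the sketch does not depend on how a particular SQLite build was compiled.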

What this query provides analysts:

  • Contextual Understanding: Labels and domain associations add meaning to the raw values.
  • Temporal Clarity: Converting Unix epoch timestamps aids in placing events on an actionable timeline.
  • Structured Output: JSON and grouped outputs improve readability and facilitate integration with other tools.

This approach allows a single investigator to quickly review autofill data using a stand-alone SQLite browser such as DB Browser for SQLite, making it accessible even for smaller organisations.

Caveat: As noted earlier in the post, some of the correlation is performed at runtime. By grouping fields by time, we can gain a higher degree of confidence that certain entries are related; however, it is important to acknowledge that this method cannot guarantee a perfect one-to-one mapping in all cases.
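
The time-based grouping in this caveat can be illustrated with a short Python sketch. It assumes entries are (epoch_seconds, label, value) tuples and treats entries separated by no more than a small gap as belonging together. The two-second default is an arbitrary assumption for illustration, not a value taken from Edge, and the function name is hypothetical.

```python
def cluster_by_time(entries, max_gap_seconds=2):
    """Group (epoch_seconds, label, value) entries with close timestamps.

    A new cluster starts whenever the gap to the previous entry exceeds
    max_gap_seconds. Entries in one cluster were probably, but not
    certainly, entered into the same form.
    """
    clusters = []
    for entry in sorted(entries, key=lambda e: e[0]):
        previous = clusters[-1][-1] if clusters else None
        if previous is not None and entry[0] - previous[0] <= max_gap_seconds:
            clusters[-1].append(entry)
        else:
            clusters.append([entry])
    return clusters
```

Two unrelated forms used within the same few seconds would land in one cluster, which is why the result should be read as an indication of a relationship rather than proof of one.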

Scaling forensic analysis with Velociraptor

For larger environments, manually examining each endpoint is time-consuming. Our recent investigation integrated this analysis into a series of Velociraptor artefacts, enabling us to conduct a network-wide audit. We are working with the Velociraptor developers to include these in the SQLiteHunter artefact. The query detailed in this blog is available under the source “Edge Browser Autofill_CombinedAutofill”.

GitHub – Velocidex/SQLiteHunter: Hunt for SQLite files used by various applications

The bad: A treasure trove for attackers

Infostealer malware such as RedLine[2] has become increasingly adept at targeting browser-stored data, including autofill entries. Once they gain a foothold on a machine, infostealers can quickly harvest credentials, session tokens, and other sensitive information directly from the local SQLite databases used by browsers.

Attackers who have managed to infiltrate your network know the value of this data too. We have observed threat actors routinely dumping stored browser credentials, fully aware that administrators often keep highly privileged passwords for security tooling (e.g., EDR) and virtualisation platforms (e.g., vSphere). By capitalising on these repositories, attackers gain access to critical infrastructure and security management consoles, enabling them to escalate their privileges and pivot deeper into the environment with minimal effort.

Mitigations and preventative measures

  • Centralised Browser Control: Enforce the use of managed browsers, such as Edge for Business or Chrome for Enterprise, and deny the use of any unauthorised browsers. With centralised control, IT teams can disable autofill for sensitive domains or ensure data is cleared upon exit.
  • Strict Data Governance Policies: Educate users and enforce policies prohibiting the entry of passwords, credit card details, national insurance numbers, and other high-risk data into browser forms.
  • Regular Auditing & Sanitisation: Conduct scheduled audits (via Velociraptor or similar tools) to identify and purge sensitive autofill and login data, maintaining a cleaner security posture.
  • User Behaviour Analysis: Evaluating the types of forms commonly autofilled helps guide future training and policy adjustments, ensuring that sensitive information is not casually typed into insecure fields. It may even inform which third-party sites you use, if they are known to be poorly written.
  • Use standard HTML attributes: Attributes like autocomplete="off" or autocomplete="new-password" are recommended by W3C to guide browsers to record form data in the correct way.
  • Follow Semantic Markup: Clear, standards-based form structures and proper labelling help browsers like Edge interpret and store form data correctly.
  • Disable Autofill: As a proactive measure, navigate to your browser settings and disable autofill for payment details.
  • Be Discerning: Consider carefully before letting the browser remember sensitive details. The convenience must not come at the cost of security and privacy.
  • Adopt Dedicated Password Managers: Use secure password managers rather than relying on browser autofill for credentials. These specialised tools offer better encryption and controlled access, thereby reducing risk.
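
As one illustration of what a scheduled audit might look for, the sketch below scans autofill values for 16-digit sequences that pass the Luhn checksum used by payment cards. The function names are hypothetical. The pattern only covers 16-digit numbers, and a Luhn match only reduces false positives, so any hit still needs human review.

```python
import re

# Sixteen digits, optionally separated by single spaces or hyphens,
# and not embedded in a longer run of digits.
CARD_CANDIDATE = re.compile(r"(?<!\d)(?:\d[ -]?){15}\d(?!\d)")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        digit = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(value: str) -> list[str]:
    """Return 16-digit sequences in a stored value that pass the Luhn check."""
    hits = []
    for match in CARD_CANDIDATE.finditer(value):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

For example, find_card_numbers("4111 1111 1111 1111") flags the widely used test card number, while an arbitrary 16-digit reference such as 1234567890123456 fails the checksum and is ignored.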


[1] %USERPROFILE%\AppData\Local\Microsoft\Edge\User Data\Default\Web Data

[2] Safeguarding Against Silent Cyber Threats: Exploring the Stealer Log Lifecycle


About the author

Chris Hayes, Head of Incident Response at Reliance Cyber

Chris has over 12 years of experience across a series of cyber roles within both the private and public sectors. As ex-military, Chris has responded to some of the largest and most high-profile cyber-attacks, in addition to tracking sophisticated nation-state threat actor groups.