The good, the bad, and the ugly of Microsoft Edge’s autofill databases

Introduction

A recent investigation by our DFIR team into unauthorised data exposure brought to light a frequently overlooked security risk: the silent accumulation of highly sensitive information within internet browsers. While many users assume that autofill features only store basic details such as names and email addresses, the reality can be far more unsettling.

Our analysis found that Microsoft Edge, by default, automatically retained a range of sensitive data outside of its secure storage locations, including full credit card numbers, CVVs, passwords, national insurance numbers, payment details, HR leave requests, internal organisational feedback, responses to Microsoft Forms and even the prompts users had typed into ChatGPT. Information of this kind should either be stored securely within the browser or not recorded at all.

This discovery not only underlines the importance of scrutinising browser autofill data during digital forensic investigations and incident response engagements, but also highlights the serious risks to businesses that may have inadvertently breached internal policies or regulatory requirements. By understanding how these local repositories operate and the nature of the data they contain, security professionals can better mitigate risks, maintain compliance, and protect against both internal lapses and external threats.

Technical underpinnings of autofill data

When users fill out web forms, browsers often store the entered information locally so that it can be reinserted automatically the next time a similar form is encountered, all in the name of user convenience. Microsoft Edge, like many Chromium-based browsers, stores this form input data in a local SQLite database named “Web Data” on the endpoint[1]. We could write an entire piece on that database alone; however, the key tables we are going to focus on are:

autofill_edge_field_values:
- Records the raw data entered into form fields, including credentials, financial information, and personal identifiers.
- Tracks timestamps indicating when these entries were last used.
autofill_edge_field_client_info:
- Maps each form field to a specific domain and form signature.
- Provides labels and metadata that add context, helping analysts identify which applications or websites the data originated from.

By correlating these tables, analysts can reconstruct entire form submissions, identify patterns, and precisely determine when and where specific data was collected.
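
Before correlating anything, it helps to confirm which columns the two tables actually expose on the endpoint being examined. The sketch below is a minimal illustration rather than our production tooling: it assumes Python's standard sqlite3 module, works on a temporary copy of “Web Data” (the live file may be locked while Edge is running), and uses hypothetical function names (describe_tables, describe_copy).

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path

# Table names taken from the article; everything else here is illustrative.
TABLES = ("autofill_edge_field_values", "autofill_edge_field_client_info")


def describe_tables(db_path, tables=TABLES):
    """Return {table name: [column names]}; a missing table maps to []."""
    conn = sqlite3.connect(db_path)
    try:
        columns = {}
        for table in tables:
            # PRAGMA table_info yields one row per column; index 1 is the name.
            rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
            columns[table] = [row[1] for row in rows]
        return columns
    finally:
        conn.close()


def describe_copy(web_data_path):
    """Copy the database to a temporary directory and inspect the copy."""
    with tempfile.TemporaryDirectory() as tmp:
        working_copy = Path(tmp) / "Web Data"
        shutil.copy2(web_data_path, working_copy)
        return describe_tables(working_copy)
```

Calling describe_copy with the path from footnote [1] returns the column names for both tables. Column names may differ between Edge versions, so the output should be treated as the reference for any later query.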

Beyond Names and Email Addresses

Common misconceptions hold that browser autofill only retains mundane contact details. In our recent investigation, our team uncovered a much broader and more sensitive scope of information, including:

Microsoft Forms Responses: Detailed answers to internal compliance surveys, performance reviews, and user satisfaction questionnaires.

**Figure 1.** Microsoft Form data – form was not submitted but still recorded.

HR Requests & Organisational Feedback: Confidential HR data, such as absence requests and managerial feedback forms, inadvertently logged by the browser.

**Figure 2.** “SurveyMonkey” free text box logged as form data.

Payment & Financial Details: Full credit card numbers, billing addresses, and other payment-related information, posing severe financial and regulatory risks.

**Figure 3.** “Getmybalance.com” recorded a 16-digit debit card number as form data (extracted from SQLite into CSV).

Credentials & National Insurance Numbers: Passwords, login pairs for corporate systems, and national insurance numbers, providing attackers with personal details and a foothold into critical infrastructure.

**Figure 4.** “registertovote.service.gov.uk” recorded national insurance number as form data (extracted from SQLite into CSV).

Customer Data from Web-Based Platforms: Sensitive customer details entered by employees into web forms — such as personally identifiable information, contact details, and account numbers — can end up stored locally, turning a simple convenience feature into a serious data protection concern.

We also found that data entered into generative AI platforms such as ChatGPT is recorded and can be correlated to reconstruct the conversation. This is particularly useful for understanding whether company information is being entered into these platforms.

**Figure 5.** “chatgpt.com” recorded data inputted into canvas fields as well as other conversations.

Why sensitive data ends up in autofill tables

This unexpected breadth of captured data likely originates from website construction and form design. Browsers like Edge rely on HTML attributes (e.g., type="password", autocomplete="off", or autocomplete="cc-number") and heuristic algorithms to determine which fields are sensitive. If a form developer does not label fields properly, fails to use recommended autocomplete attributes, or uses ambiguous field names, the browser may not recognise a field as sensitive. Without these signals, the browser treats the field as a regular input, storing and potentially autofilling data that should never be retained locally.

Over time, this leads to repositories of information — from credit card details to HR-related forms — that the organisation never intended to store in such an exposed format. The result is a dangerous stockpile of sensitive data hiding in plain sight.

The good: Forensic value

From a digital forensics and incident response perspective, these autofill databases offer invaluable insights. Crucially, this data often persists even after a user clears their regular browsing history. Such persistence supports:

Reconstructing Timelines: By aligning date_created and date_last_used timestamps with known events, analysts can trace the activities of a user, attacker or investigation subject and correlate them with suspicious incidents.
Attribution & Incident Scope: By joining the tables, analysts can determine precisely what information was entered, which website form it was entered into, which forms were targeted, and how data may have been exfiltrated.
User Behaviour Analysis: Forms are another data source for tracking a user’s activity and behaviour online, showing not only where they have visited but also how they interacted with those sites. This can provide insight where other sources are unavailable.

In short, these records serve as silent witnesses that preserve a detailed account of user interactions long after other browser records have been purged.
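
The timeline reconstruction described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a definitive implementation: it assumes that date_created and date_last_used both live in autofill_edge_field_values and hold Unix epoch seconds, that field_id links the two tables, and that it is run against a working copy of the database. The function name build_timeline is hypothetical.

```python
import sqlite3
from datetime import datetime, timezone

# Assumed layout: both timestamps are columns of autofill_edge_field_values.
TIMELINE_SQL = """
SELECT v.date_created, v.date_last_used, c.domain_value, c.label, v.value
FROM autofill_edge_field_values AS v
JOIN autofill_edge_field_client_info AS c
  ON v.field_id = c.field_id
"""


def build_timeline(conn):
    """Return one event per distinct timestamp, oldest first."""
    events = []
    for created, last_used, domain, label, value in conn.execute(TIMELINE_SQL):
        stamps = [("created", created)]
        if last_used != created:
            # Only add a second event when the value was reused later.
            stamps.append(("last_used", last_used))
        for kind, epoch in stamps:
            if epoch is None:
                continue
            events.append({
                "epoch": epoch,
                "utc": datetime.fromtimestamp(epoch, tz=timezone.utc).isoformat(),
                "event": kind,
                "domain": domain,
                "label": label,
                "value": value,
            })
    events.sort(key=lambda event: event["epoch"])
    return events
```

The resulting list can be merged with other timestamped sources, such as browsing history or EDR telemetry, to place form activity alongside known events.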

Correlating data with SQL queries

By joining the two primary tables, autofill_edge_field_values and autofill_edge_field_client_info, we can gain a far more detailed view of user interactions. The autofill_edge_field_client_info table associates each entry with a domain, a field signature and a form signature, which Edge uses to group fields belonging to the same logical form. Meanwhile, autofill_edge_field_values holds the actual data entered into these fields.

Form Signatures (form_signature_v1 and form_signature_v2) are unique identifiers the browser generates based on attributes of a webpage’s form, such as field order, naming, and layout. By hashing these characteristics, Edge can recognise when a user encounters a similar or identical form again, ensuring previously entered data can be reused in the right place. Essentially, form signatures allow the browser to identify and map groups of fields back to specific forms across visits.
Field Signatures (Field IDs): the browser assigns each field an internal identifier (field_id), tying user-entered values to the precise input element they belong to. Linking field_id values from autofill_edge_field_values with their corresponding entries in autofill_edge_field_client_info helps analysts reconstruct the structure of the form and understand exactly which data was entered into which field.

**Figure 6.** Autofill data grouped in DB Browser for SQLite.

**Figure 7.** Grouped autofill data in JSON format.

The SQL query below merges these elements, not only providing the date when the data was last used but also consolidating labels and values into both a human-readable string and a JSON object. This structured output makes it easier for analysts to interpret the data and see how individual form fields relate to one another, the domain they came from, and when they were last relevant:

SELECT
    autofill_edge_field_client_info.form_signature_v1,
    autofill_edge_field_client_info.form_signature_v2,
    autofill_edge_field_client_info.domain_value,
    datetime(autofill_edge_field_values.date_last_used, 'unixepoch') AS date_last_used,
    GROUP_CONCAT(autofill_edge_field_client_info.label || ': ' || autofill_edge_field_values.value, ', ') AS label_value_pairs,
    json_group_object(autofill_edge_field_client_info.label, autofill_edge_field_values.value) AS label_value_json
FROM
    autofill_edge_field_values
JOIN
    autofill_edge_field_client_info
    ON autofill_edge_field_values.field_id = autofill_edge_field_client_info.field_id
GROUP BY
    autofill_edge_field_client_info.form_signature_v1,
    autofill_edge_field_client_info.form_signature_v2,
    autofill_edge_field_client_info.domain_value,
    autofill_edge_field_values.date_last_used;

What this query provides analysts:

Contextual Understanding: Labels and domain associations add meaning to the raw values.
Temporal Clarity: Converting Unix epoch timestamps aids in placing events on an actionable timeline.
Structured Output: JSON and grouped outputs improve readability and facilitate integration with other tools.

This approach allows a single investigator to quickly review autofill data using a stand-alone SQLite browser such as DB Browser for SQLite, making it accessible even for smaller organisations.
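
For analysts who prefer scripting to a GUI, the same grouping can be reproduced in Python. The sketch below is a hedged illustration: it reuses the join and grouping keys from the query above but builds the groups in Python rather than with json_group_object, so that repeated labels within one form are kept as separate entries. The function name group_submissions and the output shape are hypothetical, and it should be run against a working copy of the database.

```python
import sqlite3
from collections import defaultdict

# Same join as the article's query; ordering makes the output deterministic.
ROWS_SQL = """
SELECT c.form_signature_v1, c.form_signature_v2, c.domain_value,
       v.date_last_used, c.label, v.value
FROM autofill_edge_field_values AS v
JOIN autofill_edge_field_client_info AS c
  ON v.field_id = c.field_id
ORDER BY v.date_last_used, v.field_id
"""


def group_submissions(conn):
    """Group label/value pairs by form signatures, domain and timestamp."""
    groups = defaultdict(list)
    for sig1, sig2, domain, last_used, label, value in conn.execute(ROWS_SQL):
        groups[(sig1, sig2, domain, last_used)].append((label, value))
    return [
        {
            "form_signature_v1": key[0],
            "form_signature_v2": key[1],
            "domain": key[2],
            "date_last_used": key[3],  # raw Unix epoch seconds
            "fields": fields,          # list of (label, value) tuples
        }
        for key, fields in groups.items()
    ]
```

Each returned dictionary corresponds to one row of the SQL output and can be passed to json.dumps for export.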

Caveat: As noted earlier in the post, some of the correlation is performed at runtime. By grouping fields by time, we can gain a higher degree of confidence that certain entries are related; however, it is important to acknowledge that this method cannot guarantee a perfect one-to-one mapping in all cases.
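
One way to relax the exact-timestamp grouping is to cluster entries from the same domain that were used within a few seconds of each other. The sketch below is illustrative only: the two-second default window is an arbitrary assumption, the input is assumed to be (domain, epoch_seconds, label, value) tuples with non-null timestamps, and the function name cluster_by_time is hypothetical. It raises confidence that entries are related but, as the caveat says, cannot guarantee a one-to-one mapping.

```python
def cluster_by_time(rows, window_seconds=2):
    """Cluster rows per domain when consecutive timestamps are close together.

    rows: iterable of (domain, epoch_seconds, label, value) tuples.
    """
    clusters = []
    latest = {}  # domain -> most recent cluster for that domain
    ordered = sorted(rows, key=lambda row: (row[0] or "", row[1]))
    for domain, epoch, label, value in ordered:
        current = latest.get(domain)
        if current is not None and epoch - current["end"] <= window_seconds:
            # Close enough to the previous entry: treat as the same submission.
            current["fields"].append((label, value))
            current["end"] = epoch
        else:
            current = {
                "domain": domain,
                "start": epoch,
                "end": epoch,
                "fields": [(label, value)],
            }
            clusters.append(current)
            latest[domain] = current
    return clusters
```

A wider window groups more aggressively and risks merging unrelated submissions, so the value should be tuned against known test data before relying on the output.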

Scaling forensic analysis with Velociraptor

For larger environments, manually examining each endpoint is time-consuming. In our recent investigation, we integrated this analysis into a series of Velociraptor artefacts, enabling us to conduct a network-wide audit. We are working with the Velociraptor developers to include these in the SQLiteHunter artefact. The query detailed in this blog is available under the source “Edge Browser Autofill_CombinedAutofill”.

GitHub – Velocidex/SQLiteHunter: Hunt for SQLite files used by various applications

The bad: A treasure trove for attackers

Infostealer malware such as RedLine[2] has become increasingly adept at targeting browser-stored data, including autofill entries. Once they gain a foothold on a machine, infostealers can quickly harvest credentials, session tokens, and other sensitive information directly from the local SQLite databases used by browsers.

Attackers who have managed to infiltrate your network know the value of this data too. We have observed threat actors routinely dumping stored browser credentials, fully aware that administrators often keep highly privileged passwords for security tooling (e.g., EDR) and virtualisation platforms (e.g., vSphere). By capitalising on these repositories, attackers gain access to critical infrastructure and security management consoles, enabling them to escalate their privileges and pivot deeper into the environment with minimal effort.

The ugly: Sensitive data scattered across endpoints

Beyond its value to both investigators and attackers, perhaps the most unsettling facet of Edge’s autofill data is its quiet, uncontrolled proliferation across an organisation’s endpoints. Over time, sensitive information accumulates on user devices with little to no oversight or deliberate retention strategy, creating a veritable minefield of compliance, governance, and security issues:

Data Governance Nightmares: Well-intentioned corporate policies and regulations are easily undermined by browser autofill behaviour. Sensitive information that is supposed to remain strictly within secured applications and controlled data stores can end up copied across countless endpoints, stored in local SQLite databases well outside approved infrastructure. This uncontrolled sprawl of critical data stands in stark contrast to official data management plans, complicating enforcement and oversight.
Compliance and Privacy Risks: Regulatory frameworks such as GDPR, HIPAA, and PCI-DSS impose stringent rules on how sensitive data must be handled, stored, and safeguarded. Yet browser autofill databases—holding everything from financial details to HR records—may violate these standards if left unchecked. Should a breach occur, exposing personally identifiable information (PII) or regulated data lurking in autofill stores, the organisation faces not only fines and legal consequences but also severe reputational damage.
Lack of Central Visibility & Control: While IT and security teams diligently monitor sanctioned repositories and systems, browser-stored autofill data often flies under the radar. Without targeted audits or the deployment of endpoint- or browser-level Data Loss Prevention (DLP) solutions, these hidden caches remain invisible. The result is a significant blind spot in an organisation’s security posture—critical data may sit unencrypted, uncontrolled, and easily accessible on end-user devices, waiting to be discovered by opportunistic attackers.

In essence, what started as a convenient feature to streamline user input can balloon into a systemic risk. Sensitive data that was never meant to leave secure platforms winds up scattered across laptops, desktops, and shared terminals. This silent, unregulated sprawl of valuable information epitomises the “ugly” side of autofill databases: a hidden threat that inflates the organisation’s attack surface, imperils compliance efforts, and complicates the fundamental tenets of data governance.

Mitigations and preventative measures

For Security Teams & IT Administrators:

Centralised Browser Control: Enforce the use of managed browsers, such as Edge for Business or Chrome for Enterprise, and deny the use of any unauthorised browsers. With centralised control, IT teams can disable autofill for sensitive domains or ensure data is cleared upon exit.
Strict Data Governance Policies: Educate users and enforce policies prohibiting the entry of passwords, credit card details, national insurance numbers, and other high-risk data into browser forms.
Regular Auditing & Sanitisation: Conduct scheduled audits (via Velociraptor or similar tools) to identify and purge sensitive autofill and login data, maintaining a cleaner security posture.
User Behaviour Analysis: Evaluating the types of forms commonly autofilled helps guide future training and policy adjustments, ensuring that sensitive information is not casually typed into insecure fields. It may even inform which third-party sites you use, if some are known to be poorly built.
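
As an illustration of what a scheduled audit might look for, the sketch below flags stored values that resemble payment card numbers. It is a simplified example rather than a complete detection rule: it assumes the value and field_id columns of autofill_edge_field_values, uses a Luhn checksum to reduce false positives, and masks all but the last four digits so the audit output is not itself sensitive. The function names are hypothetical.

```python
import re
import sqlite3

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"(?<!\d)\d(?:[ -]?\d){12,18}(?!\d)")


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_like(text):
    """Return digit strings in text that look like valid card numbers."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


def audit_values(conn):
    """Yield (field_id, masked number) for card-like stored values."""
    query = "SELECT field_id, value FROM autofill_edge_field_values"
    for field_id, value in conn.execute(query):
        for digits in find_card_like(str(value or "")):
            yield field_id, "*" * (len(digits) - 4) + digits[-4:]
```

Similar pattern checks could be added for other identifiers, such as national insurance numbers, although each pattern needs its own validation to keep false positives manageable.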

For Developers:

Use standard HTML attributes: Attributes like autocomplete="off" or autocomplete="new-password" are recommended by W3C to guide browsers in handling form data correctly.
Follow Semantic Markup: Clear, standards-based form structures and proper labelling help browsers like Edge interpret and store form data correctly.
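
Developers can check their own forms for missing hints with a small script. The sketch below is a rough illustration, not a complete linter: it uses Python's standard html.parser to list input and textarea elements that carry no autocomplete attribute. The class and function names are hypothetical, and it deliberately ignores an autocomplete attribute set on the enclosing form element, which a fuller check would need to take into account.

```python
from html.parser import HTMLParser


class AutocompleteAudit(HTMLParser):
    """Collect form fields that have no autocomplete attribute."""

    # Input types that do not take typed text; skipped by this sketch.
    SKIP_TYPES = {"hidden", "submit", "button", "checkbox",
                  "radio", "file", "image", "reset"}

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("input", "textarea"):
            return
        attributes = dict(attrs)
        input_type = (attributes.get("type") or "text").lower()
        if tag == "input" and input_type in self.SKIP_TYPES:
            return
        if not attributes.get("autocomplete"):
            name = attributes.get("name") or attributes.get("id") or "<unnamed>"
            self.missing.append(name)


def fields_missing_autocomplete(html):
    """Return the names of fields that lack an autocomplete attribute."""
    parser = AutocompleteAudit()
    parser.feed(html)
    parser.close()
    return parser.missing
```

Any field the script reports is a candidate for an explicit autocomplete value, so that the browser has a clear signal about how the data should be treated.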

For Users:

Disable Autofill: As a proactive measure, you can disable autofill, including for payment details, in your browser settings.
Be Discerning: Consider carefully before letting the browser remember sensitive details. The convenience must not come at the cost of security and privacy.
Adopt Dedicated Password Managers: Use secure password managers rather than relying on browser autofill for credentials. These specialised tools offer better encryption and controlled access, thereby reducing risk.

Conclusion

The discovery of highly sensitive data accumulating in Microsoft Edge’s autofill databases highlights a truth we’ve known within DFIR circles for years: convenience features can become security liabilities without proper oversight. Due to poor website construction and insufficient HTML attributes, the browser may fail to identify certain fields as sensitive, inadvertently storing data that should never have been retained. From a forensic perspective, these data stores persist beyond normal browsing history deletions, enabling detailed reconstructions of user activity. From an attacker’s standpoint, they represent a lucrative cache.

By enforcing centralised browser control, improving website form design, educating users, and regularly auditing endpoints, organisations can ensure that convenience does not overshadow security. Armed with this understanding, stakeholders can strike a more informed balance between usability and protection, mitigating the inherent risks that arise when sensitive data is left exposed in plain sight.

To do

As this is only one piece of the puzzle, our future work will extend beyond Microsoft Edge, with planned examinations of Chrome, Firefox, and other leading browsers. By broadening our scope, we aim to provide a more complete understanding of how autofill data is managed, mismanaged, and sometimes weaponised across today’s most widely used web platforms.

[1] %USERPROFILE%\AppData\Local\Microsoft\Edge\User Data\Default\Web Data

[2] Safeguarding Against Silent Cyber Threats: Exploring the Stealer Log Lifecycle

About the author

Chris Hayes, Head of Incident Response at Reliance Cyber

Chris possesses over 12 years of experience across a series of cyber roles within both the private and public sectors. As ex-military, Chris has responded to some of the largest and most high-profile cyber-attacks in addition to tracking sophisticated nation state threat actor groups.