Enterprise AI Guardrails: How Companies Are Trying to Use ChatGPT Without Leaking Data
This article examines how enterprises are navigating the tension between ChatGPT's productivity benefits and data security risks six months after Samsung's cautionary leak incident. It explores evolving strategies including acceptable use policies, DLP adaptations, OpenAI's ChatGPT Enterprise offering, vendor security questions from legal teams, and private deployment options as organizations build practical guardrails enabling AI adoption without compromising confidential information.
9/11/2023 · 4 min read


Samsung's March 2023 incident became a cautionary tale that spread through every corporate legal and security team. Engineers pasted confidential source code into ChatGPT for debugging assistance and optimization suggestions. The code potentially entered OpenAI's training data. Three separate incidents in one month prompted Samsung to ban ChatGPT across the organization.
Six months later, enterprises are caught between two realities: employees find ChatGPT transformative for productivity, and corporate security policies classify it as an unacceptable data leak risk. The tension has spawned an emerging category of "enterprise AI guardrails"—technical controls, policies, and vendor solutions attempting to enable LLM usage without compromising data security.
The Core Fear: Where Does Data Go?
The security concern is straightforward. When employees paste information into ChatGPT, what happens to it? Until recently, OpenAI's default policy was that conversations could be used for model training unless users explicitly opted out. For enterprises, this was unacceptable—even one employee forgetting to opt out could leak confidential information into future model versions.
The nightmare scenarios keeping CISOs awake: proprietary source code appearing in model suggestions for competitors, confidential M&A information surfacing in responses to other users, customer data or PII embedded in model training, trade secrets or strategic plans becoming inferable from model behavior, and legal documents with attorney-client privilege compromised.
Beyond training data, there are operational concerns. Even if data isn't used for training, it passes through OpenAI's servers. Does OpenAI log conversations? How long are logs retained? Who has access? What happens in a breach? These questions have concrete answers now, but initially created uncertainty that paralyzed enterprise adoption.
The Policy-First Response
Many organizations' first response was outright bans. JPMorgan Chase, Amazon, Apple, Verizon, and others blocked ChatGPT at the network level. Employees accessing it from corporate networks received blocked connection messages. Problem solved—from the security perspective.
But employees found workarounds immediately. Personal devices on cellular connections. Home usage. Unofficial accounts. The productivity gains were too compelling to ignore. Security teams found themselves in an unwinnable game of whack-a-mole, blocking tools while employees routed around restrictions.
Recognizing this reality, organizations shifted toward "acceptable use policies" rather than outright bans. These internal guidelines typically specify: never paste customer data, PII, or confidential information into public LLMs; use only approved enterprise instances with appropriate data protections; require opt-out of training data usage when using public versions; and obtain manager approval before using LLMs for sensitive work.
One Fortune 500 company's policy includes a helpful decision tree: "Is this information you'd feel comfortable posting on Twitter? If no, don't put it in ChatGPT." While oversimplified, it provides employees with an intuitive guideline.
The policy-only approach has obvious limitations. It relies entirely on employee judgment and compliance. One moment of inattention or ignorance can cause leaks. Organizations needed technical controls, not just policy.
Technical Controls: DLP and Filtering
Data Loss Prevention (DLP) tools are being adapted for LLM usage. Traditional DLP monitors emails, file transfers, and web uploads for sensitive data patterns—credit card numbers, social security numbers, or content marked confidential. Now DLP vendors are adding LLM-specific controls.
Netskope, Zscaler, and similar cloud security platforms now detect when employees access ChatGPT and can apply policies: warn users when they're on the site, block paste operations if content matches sensitive data patterns, allow usage but log all interactions for audit, or require approval workflows for certain content types.
The challenge is accuracy. DLP pattern matching generates false positives (blocking legitimate content) and false negatives (missing actual leaks). Code, for instance, is difficult—how does DLP distinguish between open-source examples safe to share and proprietary algorithms that aren't?
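To make that tradeoff concrete, here is a minimal sketch of the kind of pattern check a DLP gateway might run before allowing a paste into a public LLM. The regexes, the Luhn checksum, and the check_paste helper are illustrative assumptions rather than any vendor's implementation, and they exhibit exactly the false-positive and false-negative problems described above.

```python
import re

# Illustrative patterns only: a real DLP product ships far larger rule sets
# plus ML classifiers, and tunes them per organization.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def luhn_valid(number: str) -> bool:
    """Luhn checksum, used to cut down false positives on card-like numbers."""
    digits = [int(d) for d in re.sub(r"\D", "", number)][::-1]
    total = sum(digits[0::2]) + sum(sum(divmod(d * 2, 10)) for d in digits[1::2])
    return total % 10 == 0

def check_paste(text: str) -> list[str]:
    """Return reasons to warn or block; an empty list means allow the paste."""
    findings = []
    if SSN_RE.search(text):
        findings.append("possible SSN")
    for match in CARD_RE.finditer(text):
        if luhn_valid(match.group()):
            findings.append("possible payment card number")
    if "CONFIDENTIAL" in text.upper():
        findings.append("content marked confidential")
    return findings

# A paste containing a well-known test card number triggers a finding.
print(check_paste("Refund card 4111 1111 1111 1111 per ticket 8841"))
```

Notice what the sketch cannot do: it has no idea whether a block of pasted code is an open-source snippet or a proprietary algorithm, which is why organizations layer policy and education on top of pattern matching.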
Some organizations implement "LLM usage forms" where employees request permission before using ChatGPT for specific projects, describing what data they'll use and receiving approval from security teams. This creates accountability but slows workflows significantly.
The Vendor Solution: Enterprise Agreements
OpenAI recognized enterprise concerns threatened adoption and responded with product and policy changes. In March, OpenAI introduced API data usage policies clarifying that API data isn't used for training—a critical distinction from the consumer ChatGPT product. In August, OpenAI launched ChatGPT Enterprise with explicit commitments: no conversation data used for training, SOC 2 compliance, admin controls for team management, and enhanced security and data handling.
These announcements shifted conversations in corporate legal and security teams. Rather than blanket bans, organizations could negotiate enterprise agreements with defined data handling terms. The questions legal teams now ask vendors are remarkably consistent:
Data usage: "Will our data be used for training? Under any circumstances?"
Data residency: "Where is data stored geographically? Can we require specific regions?"
Access controls: "Who within your organization can access our data? What audit trails exist?"
Retention: "How long are conversations and prompts retained? Can we define retention periods?"
Breach notification: "What are the SLAs for notifying us of security incidents?"
Compliance certifications: "Do you have SOC 2, ISO 27001, or HIPAA compliance where applicable?"
Termination and deletion: "When we terminate the service, how is our data deleted? Can we verify deletion?"
Anthropic's enterprise offering includes similar commitments. Microsoft's Azure OpenAI Service gained traction specifically because it provides enterprise controls, allowing organizations to deploy GPT-4 within their Azure environments with contractually defined data handling.
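For teams taking the Azure route, the integration itself is small. The sketch below assumes the 2023-era openai Python SDK (the v0.28-style interface) pointed at an Azure OpenAI resource; the endpoint, API version, and deployment name are placeholders, not values from any real deployment.

```python
import os
import openai

# Placeholder values: swap in your own Azure OpenAI resource, API version,
# and deployment name. Uses the 2023-era openai SDK (v0.28-style interface).
openai.api_type = "azure"
openai.api_base = "https://your-resource-name.openai.azure.com/"
openai.api_version = "2023-05-15"
openai.api_key = os.environ["AZURE_OPENAI_KEY"]

response = openai.ChatCompletion.create(
    engine="gpt-4-internal",  # the deployment name created in the Azure portal
    messages=[
        {"role": "system", "content": "You are an internal assistant."},
        {"role": "user", "content": "Summarize this week's release notes."},
    ],
)
print(response["choices"][0]["message"]["content"])
```

The appeal is less the code than the contract behind it: requests stay within the organization's Azure tenant and are governed by Microsoft's data handling terms rather than a consumer privacy policy.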
Private Deployments and Self-Hosting
For organizations with the strictest security requirements, such as financial institutions, healthcare providers, and government contractors, even enterprise agreements don't suffice. They're exploring private deployments where models run entirely within their infrastructure.
Open-source models like Llama 2 enable this approach. Organizations can deploy models on their own servers, ensuring data never leaves their network. The tradeoff is capability—Llama 2 70B is competent but not GPT-4-level—and operational complexity. Running large models requires significant infrastructure and ML operations expertise.
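As a rough illustration of what running entirely within your own infrastructure looks like, the sketch below loads a locally stored copy of Llama 2 with Hugging Face transformers. The model path is a placeholder, the 7B chat variant keeps the example small, and a production deployment would add serving, quantization, and access controls on top.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Hypothetical local path to Llama 2 weights already downloaded from Meta /
# Hugging Face; nothing in this script calls out to an external API.
model_path = "/models/llama-2-7b-chat-hf"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")  # needs accelerate

generate = pipeline("text-generation", model=model, tokenizer=tokenizer)
result = generate(
    "Explain the tradeoffs of self-hosting an LLM in two sentences.",
    max_new_tokens=128,
    do_sample=False,
)
print(result[0]["generated_text"])
```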
Some enterprises are pursuing hybrid approaches: open-source models for highly sensitive work, enterprise API agreements for less sensitive applications. This balances security with capability.
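A hypothetical router for that hybrid setup might look like the following: prompts classified as sensitive stay on the in-network model, everything else goes to the contract-backed API. The sensitivity markers and both query functions are stand-ins, not anyone's production logic.

```python
# The two query_* functions are stand-ins for the deployments discussed above;
# a real router would call the self-hosted model and the enterprise API.
SENSITIVE_MARKERS = ("confidential", "customer pii", "source code")

def query_local_model(prompt: str) -> str:
    # Stand-in for an in-network model such as a self-hosted Llama 2.
    return f"[local model] {prompt[:40]}..."

def query_enterprise_api(prompt: str) -> str:
    # Stand-in for a contract-backed service such as Azure OpenAI.
    return f"[enterprise API] {prompt[:40]}..."

def route(prompt: str, classification: str) -> str:
    """Keep restricted material in-network; send everything else to the vendor."""
    text = prompt.lower()
    if classification == "restricted" or any(m in text for m in SENSITIVE_MARKERS):
        return query_local_model(prompt)
    return query_enterprise_api(prompt)

print(route("Draft a LinkedIn post about our conference booth", "public"))
print(route("Review this confidential M&A term sheet", "restricted"))
```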
The Emerging Consensus
By September 2023, patterns are crystallizing. Outright bans are recognized as ineffective and counterproductive. The emerging consensus approach combines multiple layers:
Clear policies defining acceptable use with practical guidelines.
Enterprise vendor agreements with contractual data protections.
Technical controls via DLP and cloud security tools monitoring usage.
Employee education ensuring teams understand risks and appropriate usage.
Approved alternatives providing sanctioned tools for various use cases.
Organizations are also recognizing that different use cases require different controls. Marketing content generation has different risk profiles than legal contract analysis. Tiered approaches allow broader access for lower-risk applications while restricting sensitive domains.
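One way to express such a tiered policy is a simple mapping from use case to approved tools, along the lines of the hypothetical configuration below; the use cases, risk tiers, and tool lists are examples, not recommendations.

```python
# Example tiering only: use cases, tier names, and tool lists are hypothetical.
USE_CASE_POLICY = {
    "marketing_content":     {"risk": "low",    "allowed": ["ChatGPT Enterprise", "Azure OpenAI", "self-hosted"]},
    "internal_code_review":  {"risk": "medium", "allowed": ["Azure OpenAI", "self-hosted"]},
    "legal_contract_review": {"risk": "high",   "allowed": ["self-hosted"]},
    "customer_pii_analysis": {"risk": "high",   "allowed": []},  # no LLM use permitted
}

def allowed_tools(use_case: str) -> list[str]:
    """Return the sanctioned tools for a use case; unknown cases default to none."""
    return USE_CASE_POLICY.get(use_case, {}).get("allowed", [])

print(allowed_tools("marketing_content"))       # broad access for low-risk work
print(allowed_tools("legal_contract_review"))   # restricted to in-network models
```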
The challenge remains employee experience. Heavy-handed controls frustrate employees and erode the productivity gains LLMs promise. Finding the balance between enabling productivity and protecting data is the defining enterprise AI challenge of 2023.
As tools mature and best practices spread, this balance will become easier. But for now, enterprises are navigating uncertain terrain, trying to harness transformative technology without creating catastrophic security risks. The guardrails are being built in real-time, learning from incidents and iterating rapidly. It's messy, imperfect, and absolutely necessary.

