Researchers are often cautioned against uploading copyrighted research papers or participant data to cloud-based AI chatbots like ChatGPT. The concern is straightforward: AI models might use uploaded data for training, potentially leading to copyright infringement or violations of human-subject protections.
But are researchers unnecessarily avoiding powerful AI tools due to an oversimplified assumption that all Generative AI (GenAI) chatbots function the same way?
The reality is that not all AI chatbots handle data in the same manner. Some use uploaded data for model training, while others—particularly enterprise versions—do not. In this post, we examine how ChatGPT processes data across different service levels.
ChatGPT: More Than Just Storage
Unlike traditional cloud-storage services such as Dropbox, ChatGPT is not just a passive file repository; it is an active AI workspace designed to process and interact with uploaded data. When a researcher uploads a document or submits a prompt, the AI analyzes the content in real time to generate meaningful responses.
Secure, AI-Powered Workshops
Imagine you rent a workshop inside an AI facility operated by OpenAI. The workshop is secure—only you have access unless you invite others in. But rather than just storing files, you need assistance processing them. To help, OpenAI provides AI-powered assistants (chatbot models) that can analyze your data, generate responses, and refine your work.
These AI assistants, however, do not have memory. When you turn them off, they forget everything they worked on. The level of security and data retention depends on the type of workspace you use.
Understanding Different AI Workspaces
Depending on the version of ChatGPT you use, you gain access to different types of AI workspaces, each with varying levels of data retention and privacy controls.
1. Monitored Workshop (Free-tier AI, e.g., ChatGPT Free, DeepSeek Free, Google’s Gemini Free)
This workspace is monitored. You can invite AI assistants (robots) from OpenAI into it to help process your data.
- AI Interaction: When active, the robots interact with your data, but once turned off, they don’t retain memory.
- Conversation History: Your data and logs remain in the workspace, accessible when you return, unless explicitly deleted by you.
- Monitoring & Data Use: By using this free tier, you permit OpenAI to record interactions within the workshop (imagine a camera). Even if you delete your conversation history, OpenAI retains recordings, potentially indefinitely. Recordings can be reviewed by OpenAI’s engineers for model improvement or accessed to comply with legal obligations.
2. Cleanroom (Pro-tier AI with Temporary Chat Mode)
This workspace offers enhanced privacy for paying users, who can opt into cleanrooms for private interactions.
- AI Interaction: AI assistants process your data actively during your session, but their memory is completely erased after you leave.
- Conversation History: No conversation logs or data history remain after you exit; every session starts from scratch. The room is completely wiped.
- Monitoring & Data Use: Recordings are kept strictly for security purposes and are permanently deleted after 30 days. They are neither shared with engineers nor used for AI training, and access is strictly controlled and granted only under legal requirements.
3. Private AI Vault (Enterprise-tier AI, e.g., ChatGPT Teams & Enterprise)
This is a highly secure, private AI environment designed for sensitive research and enterprise use.
- AI Interaction: AI assistants actively interact with your data and maintain continuity across sessions through logs.
- Conversation History: You can maintain conversation logs, allowing you and AI assistants to pick up exactly where you left off.
- Monitoring & Data Use: Recordings are still made in these workspaces, but OpenAI does not use data from these vaults for training or model improvement. Data access is strictly regulated and permitted only under legal compulsion, with clear protocols.
Here’s a summary:
| Feature | Monitored Workshop (Free-tier) | Cleanroom (Pro-tier, Temporary Chat) | Private AI Vault (Enterprise-tier) |
|---|---|---|---|
| AI Interaction | Real-time interaction; no retained memory | Real-time interaction; no retained memory | Real-time interaction; no retained memory |
| Conversation History | Logs persist unless explicitly deleted | No logs or conversation history; each session begins fresh | Logs persist unless explicitly deleted |
| Monitoring & Data Use | Conversations recorded indefinitely; OpenAI may use recordings to train and improve models or for legal compliance. | Security recordings retained for 30 days only; data never used for AI training. Access strictly limited to legal compliance. | No recordings for AI training. Data accessed strictly to provide service, enforce policies, or comply with law. |
Now Comes Memory
One of the most transformative recent updates to ChatGPT is the introduction of memory: a feature that fundamentally changes how the system can support continuity and personalization in research workflows.
Unlike traditional stateless interactions, where each session starts fresh, memory allows ChatGPT to retain and recall information across chats, enabling more consistent, tailored responses. When memory is enabled (currently available for Pro and Enterprise users), the model can extract key information from your interactions, such as your name, preferences, ongoing projects, or frequently requested formats, and store it in an external memory module, separate from the model’s internal weights or any conversation.
It’s important to note that stored information is not embedded in the model itself. Instead, memory functions as a structured layer of user-specific context that is dynamically reinserted into prompts when relevant. For example, if you previously asked ChatGPT to “remember that I work with SPSS and need APA-style output” (and saw a notice that this was added to your memories), future responses can automatically incorporate that preference without your needing to restate it. Users are always notified when memory updates occur, and they retain full control: memory can be viewed, edited, or wiped entirely at any time.
Importantly, memory does not change how the model reasons. Instead, it shapes the starting context, helping the system stay aligned with your goals over time. For researchers, this means more efficient workflows and less repetition of standing instructions (e.g., “do not use emojis”, “use base R”), as the sketch below illustrates.
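To make that mechanism concrete, here is a minimal Python sketch of how an external memory layer could work in principle. The `memory_store` dictionary and the `remember` and `build_prompt` functions are illustrative assumptions, not OpenAI’s actual implementation; the point is simply that remembered facts sit outside the model and are re-supplied as context with each new prompt.

```python
# A minimal conceptual sketch, not OpenAI's implementation: "memory" lives
# outside the model and is prepended to each new prompt when relevant.

memory_store = {}  # hypothetical user-specific store, separate from model weights


def remember(key: str, fact: str) -> None:
    """Save a user-specific fact, e.g. preferred tools or output style."""
    memory_store[key] = fact


def build_prompt(user_message: str) -> str:
    """Reinsert stored facts as context ahead of the user's message."""
    context = "\n".join(f"- {fact}" for fact in memory_store.values())
    if context:
        return f"Known about this user:\n{context}\n\nUser: {user_message}"
    return f"User: {user_message}"


# The preference is stored once, then shapes every later prompt.
remember("stats_workflow", "Works with SPSS and needs APA-style output.")
print(build_prompt("Summarise these regression results."))
```

Running the sketch prints a prompt that already carries the SPSS and APA-style preference, which is the same effect users see when ChatGPT recalls a stored memory: the model itself is unchanged, but its starting context is not.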
You can learn more from OpenAI’s official memory documentation here.