Idea Details - IdeaForge

Original Idea

Market Analysis

Top Competitors

Uncheck.ai: Focuses on making AI-generated content undetectable by AI detectors, aiming for a human-like writing style. Offers different modes to bypass AI detection.
Rewritify: An AI paraphrasing tool that helps rewrite AI content to sound more human, aiming to bypass AI detection tools like GPTZero and Originality.ai.
Copyleaks: Provides an AI content detector that identifies AI-generated text in various content types. Focuses on accuracy and provides insights for content improvement.

Market Gaps / Complaints

Existing AI detection tools produce false positives, labeling human-written content as AI-generated, which leads to inaccurate accusations, especially in academic settings.
Current detectors often lack detailed analysis, simply flagging content as AI without providing specific insights or guidance on how to improve the text.
Many undetectable AI tools focus on bypassing AI detection without necessarily improving the quality or clarity of the original AI-generated content.

Unique Selling Points

Precise AI Fingerprinting: Beyond Binary Detection

Deep Scan goes beyond simple 'AI' or 'Human' classifications. Our advanced algorithms identify specific linguistic patterns associated with AI-generated text, providing a nuanced analysis of potential AI influence, minimizing false positives and empowering users with accurate insights.

Actionable Feedback: Understand WHY and HOW to Improve

Unlike basic AI detectors, Deep Scan pinpoints specific areas for improvement: sentence structure, repetition, tone shifts, and more. We provide actionable suggestions to refine your writing, ensuring it reads naturally and authentically, regardless of its origin.

Human-Quality Assurance: Craft Authentic Content with Confidence

Whether AI-assisted or entirely original, Deep Scan helps you polish your text to meet the highest standards of human writing. Eliminate AI tells and ensure your content resonates with readers, building trust and credibility with every publication. Publish with absolute confidence.

Feature Breakdown

Text Input and AI Detection

Allows users to input text and initiates the core AI detecti...

AI Likelihood Score

Provides a score (e.g., percentage) indicating the likelihoo...

Structural Analysis Module

Analyzes sentence structure, identifies repetitive patterns,...

Repetition Detection

Identifies repeated words, phrases, and sentence structures,...

Tone Variation Analysis

Detects shifts in tone and style within the text, flagging p...

Highlighting and Suggestion Interface

Presents the analyzed text with highlighted areas of concern...

Export Functionality

Allows users to export the analyzed text with highlighted ar...

Basic User Authentication

Secure login and registration functionality. Allows users to...

Master Coding Prompt

Final Output

```markdown
# Cursor/Windsurf Master Prompt: Deep Scan MVP

This prompt outlines the steps to build a Minimum Viable Product (MVP) for "Deep Scan," an AI detection and writing improvement tool. Deep Scan analyzes text, identifies potential AI-generated content indicators (structural issues, repetition, tone shifts), and provides actionable suggestions for improvement.

## 1. Project Setup

**Folder Structure:**

```
deepscan/
├── app/
│   ├── __init__.py
│   ├── models.py  # Database models
│   ├── routes.py  # Flask routes/controllers
│   ├── auth.py    # Authentication blueprint (login, registration, logout)
│   ├── ai_detection.py # Core AI Detection logic
│   ├── structural_analysis.py
│   ├── repetition_detection.py
│   ├── tone_analysis.py
│   ├── utils.py     # Utility functions
│   ├── forms.py     # WTForms for input validation
│   └── templates/   # Jinja2 templates
│       ├── base.html
│       ├── index.html
│       ├── analysis_result.html
│       ├── login.html
│       └── register.html
├── tests/
│   ├── __init__.py
│   ├── test_ai_detection.py
│   ├── test_structural_analysis.py
│   # ... other tests
├── migrations/      # Alembic migrations (if using SQLAlchemy)
├── venv/            # Virtual environment (created later)
├── run.py           # Entry point that calls create_app()
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
├── config.py        # Application configuration
└── README.md
```

**Dependencies (requirements.txt):**

```
Flask==3.0.0
Flask-SQLAlchemy==3.1.1
Flask-Migrate==4.0.7
Flask-WTF==1.2.1
Flask-Login==0.6.3 # Imported by models.py and app/__init__.py
Werkzeug==3.0.1 # Pin to prevent conflicts
python-dotenv==1.0.0
scikit-learn==1.4.0 # For machine learning models
nltk==3.9 # For Natural Language Processing
transformers==4.35.2 # For advanced text analysis models like BERT
torch==2.1.2 #PyTorch for Transformers
beautifulsoup4==4.12.2 # For HTML parsing
bcrypt==4.1.2 # Password Hashing
flasgger==0.9.5 #API documentation
```

## 2. Tech Stack Definition

*   **Backend Framework:** Flask (Python)
*   **Database:** SQLite (for MVP, easily switchable to PostgreSQL later)
*   **ORM:** Flask-SQLAlchemy
*   **Migrations:** Flask-Migrate (Alembic)
*   **Forms:** Flask-WTF
*   **Session Management:** Flask-Login
*   **AI Libraries:**
    *   `scikit-learn`: For traditional machine learning models.
    *   `nltk`: For natural language processing tasks (tokenization, stemming, sentiment analysis).
    *   `transformers` (Hugging Face): For more advanced AI detection using pre-trained models like BERT or RoBERTa.
    *   `torch`: PyTorch framework for the transformers library.
*   **Environment Variables:** `python-dotenv`
*   **Containerization:** Docker, Docker Compose
*   **API Documentation:** `flasgger`

## 3. Database Initial Schema (models.py)

```python
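from flask_login import UserMixin
from werkzeug.security import generate_password_hash, check_password_hash

# Reuse the single SQLAlchemy instance created in app/__init__.py.
# Creating a second SQLAlchemy() here would leave these models unregistered with the app.
from app import db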

class User(UserMixin, db.Model):
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(80), unique=True, nullable=False)
    email = db.Column(db.String(120), unique=True, nullable=False)
    password_hash = db.Column(db.String(256))  # Werkzeug 3 scrypt hashes exceed 128 characters
    analyses = db.relationship('Analysis', backref='user', lazy=True)

    def set_password(self, password):
        self.password_hash = generate_password_hash(password)

    def check_password(self, password):
        return check_password_hash(self.password_hash, password)

class Analysis(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    text = db.Column(db.Text, nullable=False)
    ai_likelihood = db.Column(db.Float)
    analysis_result = db.Column(db.Text) # JSON string of the analysis details
    user_id = db.Column(db.Integer, db.ForeignKey('user.id'), nullable=False)
    timestamp = db.Column(db.DateTime, default=db.func.now())

    def __repr__(self):
        return f'<Analysis {self.id}>'

```

## 4. Feature Implementation Steps

**A. Project Setup & User Authentication:**

1.  **Create Virtual Environment:** `python -m venv venv`
2.  **Activate Virtual Environment:** `source venv/bin/activate` (Linux/macOS) or `venv\Scripts\activate` (Windows)
3.  **Install Dependencies:** `pip install -r requirements.txt`
4.  **Configure Flask App (config.py):**

    ```python
    import os
    from dotenv import load_dotenv

    load_dotenv()

    class Config:
        SECRET_KEY = os.environ.get('SECRET_KEY') or 'your_secret_key'  # Change this in production!
        SQLALCHEMY_DATABASE_URI = os.environ.get('DATABASE_URL') or 'sqlite:///site.db'
        SQLALCHEMY_TRACK_MODIFICATIONS = False # reduce memory usage
    ```

5.  **Initialize Flask App (app/\_\_init\_\_.py):**

    ```python
    from flask import Flask
    from flask_sqlalchemy import SQLAlchemy
    from flask_migrate import Migrate
    from flask_login import LoginManager
    from config import Config
    from flasgger import Swagger

    db = SQLAlchemy()
    migrate = Migrate()
    login_manager = LoginManager()
    login_manager.login_view = 'auth.login'  # Where to redirect if not logged in
    login_manager.login_message_category = 'info' # Bootstrap class for message

    def create_app(config_class=Config):
        app = Flask(__name__)
        app.config.from_object(config_class)

        db.init_app(app)
        migrate.init_app(app, db)
        login_manager.init_app(app)

        from app.routes import bp as main_bp
        app.register_blueprint(main_bp)

        from app.auth import bp as auth_bp
        app.register_blueprint(auth_bp, url_prefix='/auth')

        Swagger(app)

        return app

    from app import models
    ```

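6.  **Create Authentication Blueprint (app/auth.py, app/forms.py, app/templates/login.html, register.html):** Implement login and registration routes and forms using Flask-WTF and Flask-Login. Hash and verify passwords through `User.set_password` and `User.check_password` from `models.py`; these use Werkzeug's hashing, so switch those two methods to `bcrypt` if that library is preferred.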

    *   **app/auth.py:** Handles user registration, login, and logout.
    *   **app/forms.py:** Defines forms for registration and login with appropriate validation.
    *   **app/templates/login.html, register.html:** Basic HTML templates for the login and registration pages.  (Focus on functional, not highly styled initially.)

7.  **User Loader (app/\_\_init\_\_.py):**

    ```python
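    # Place at the bottom of app/__init__.py, after db and login_manager are defined
    from app.models import User

    @login_manager.user_loader
    def load_user(user_id):
        # Session.get replaces the legacy Query.get API in SQLAlchemy 2.x
        return db.session.get(User, int(user_id))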
    ```

**B. Core AI Detection and Analysis:**

1.  **Text Input and Processing (app/routes.py, app/templates/index.html):**

    *   Create a route that accepts text input (either pasted or uploaded from a file using Flask-WTF FileField).
    *   Sanitize the input text.
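
The sanitizing step above could look roughly like the following. This is a minimal sketch using only the standard library; the function name `sanitize_text`, the `MAX_CHARS` limit, and the choice to strip tags with a regular expression are assumptions for illustration, not part of the original plan.

```python
import html
import re
import unicodedata

MAX_CHARS = 20_000  # hypothetical upper bound on accepted input length

_TAG_RE = re.compile(r"<[^>]+>")
_SPACES_RE = re.compile(r"[ \t]+")


def sanitize_text(raw: str, max_chars: int = MAX_CHARS) -> str:
    """Return plain text that is safe to pass to the analysis modules."""
    text = _TAG_RE.sub(" ", raw)                # drop anything shaped like an HTML tag
    text = html.unescape(text)                  # turn entities such as &amp; into characters
    text = unicodedata.normalize("NFKC", text)  # fold non-breaking spaces, ligatures, etc.
    # Drop characters in the Unicode "Other" categories (control, format, ...),
    # keeping newlines and tabs.
    text = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch)[0] != "C"
    )
    text = _SPACES_RE.sub(" ", text)            # collapse runs of spaces and tabs
    text = re.sub(r"\n{3,}", "\n\n", text)      # collapse runs of blank lines
    return text.strip()[:max_chars]
```

Jinja2 templates in Flask still escape the text when rendering, so this step is about giving the analyzers clean input rather than replacing output escaping.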

2.  **AI Likelihood Score (app/ai_detection.py):**

    *   Implement a function `detect_ai(text)` that returns a probability score (0-1) indicating the likelihood of AI generation.
    *   **MVP Implementation:** Start with a simple model, such as training a Naive Bayes classifier or Logistic Regression model on a dataset of human-written and AI-generated texts.
        *   Use features like:
            *   Burstiness (variance of sentence length).
            *   Perplexity (using a language model).
            *   Frequency of common AI-generated phrases.
    *   **Future Enhancement:**  Integrate with a pre-trained transformer model (BERT, RoBERTa, etc.) fine-tuned for AI detection. This will significantly improve accuracy.  Use `transformers` library.
    *   **Example using a basic model:**

        ```python
        # app/ai_detection.py
        from functools import lru_cache

        import joblib  # For saving and loading the model
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import train_test_split

        MODEL_PATH = 'ai_model.joblib'
        VECTORIZER_PATH = 'tfidf_vectorizer.joblib'

        def train_and_save():
            """Trains a toy classifier and saves it. Run once, not on every import."""
            # Sample data (replace with a larger, representative dataset)
            human_texts = ["This is written by a human.", "Another human sentence."]
            ai_texts = ["This was generated by AI.", "AI wrote this."]
            texts = human_texts + ai_texts
            labels = [0] * len(human_texts) + [1] * len(ai_texts)  # 0: Human, 1: AI

            # Split first so the vectorizer is fitted on the training texts only
            train_texts, test_texts, y_train, y_test = train_test_split(texts, labels, test_size=0.2)
            vectorizer = TfidfVectorizer()
            model = LogisticRegression()
            model.fit(vectorizer.fit_transform(train_texts), y_train)
            print("Held-out accuracy:", model.score(vectorizer.transform(test_texts), y_test))

            # Save the model and vectorizer
            joblib.dump(model, MODEL_PATH)
            joblib.dump(vectorizer, VECTORIZER_PATH)

        @lru_cache(maxsize=1)
        def _load():
            """Loads the saved model and vectorizer once per process."""
            return joblib.load(MODEL_PATH), joblib.load(VECTORIZER_PATH)

        def detect_ai(text):
            """Returns the probability (0-1) that the text is AI-generated."""
            model, vectorizer = _load()
            return float(model.predict_proba(vectorizer.transform([text]))[0][1])

        if __name__ == '__main__':
            train_and_save()
        ```
        The two `.joblib` files are loaded relative to the working directory, so keep them in the `deepscan/` project root or adjust the paths in `ai_detection.py`.
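
The burstiness feature listed above (variance of sentence length) can be computed without any model. The sketch below is one possible reading of it: the names `sentence_lengths` and `burstiness` are hypothetical, and the sentence split is a naive regular expression, not the NLTK tokenizer used elsewhere in this prompt.

```python
import re
import statistics


def sentence_lengths(text: str) -> list[int]:
    """Word counts per sentence, using a naive split on '.', '!' and '?'."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Population variance of sentence length in words.

    Returns 0.0 when there are fewer than two sentences, since a single
    sentence has no variation to measure.
    """
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    return float(statistics.pvariance(lengths))
```

A value like this could be appended to the TF-IDF features or used on its own as a cheap signal; whether low variance actually indicates AI authorship has to be checked against labeled data.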

3.  **Structural Analysis (app/structural_analysis.py):**

    ```python
    # app/structural_analysis.py
    import nltk
    from nltk.tokenize import sent_tokenize, word_tokenize
    from nltk.corpus import stopwords

    # NLTK 3.9 loads the sentence tokenizer from 'punkt_tab'; older releases use 'punkt'
    for resource in ('punkt', 'punkt_tab', 'stopwords'):
        nltk.download(resource, quiet=True)

    STOP_WORDS = set(stopwords.words('english'))  # Build once instead of once per sentence

    def analyze_structure(text):
        """Analyzes sentence structure and identifies potential issues."""
        sentences = sent_tokenize(text)
        results = []
        for i, sentence in enumerate(sentences):
            # Keep alphabetic tokens only, so punctuation does not count as a word
            words = [w.lower() for w in word_tokenize(sentence) if w.isalpha()]
            filtered_words = [w for w in words if w not in STOP_WORDS]
            if len(filtered_words) < 5:  # Simple check for short sentences
                results.append({"sentence_index": i, "issue": "Very short sentence", "suggestion": "Consider combining with another sentence."})
            # Add more sophisticated analysis here in future iterations.  Look for passive voice, complex clauses, etc.
        return results
    ```

4.  **Repetition Detection (app/repetition_detection.py):**

    ```python
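    # app/repetition_detection.py
    import re
    from collections import Counter

    def detect_repetition(text):
        """Identifies words that appear more than twice."""
        # Tokenize on letters and apostrophes so "word," and "word" count as the same token
        words = re.findall(r"[a-z']+", text.lower())
        word_counts = Counter(words)
        # Common words such as "the" are not filtered here; remove stop words first if they add noise
        repeated_words = {word: count for word, count in word_counts.items() if count > 2}
        return repeated_words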
    ```
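
The module above counts single words only. Repeated phrases, which the feature description also mentions, can be approximated by counting word n-grams. The following is a small sketch under that assumption; `detect_repeated_phrases` and its default parameters are illustrative names and values, not part of the original module.

```python
import re
from collections import Counter


def detect_repeated_phrases(text: str, n: int = 3, min_count: int = 2) -> dict[str, int]:
    """Return n-word phrases that occur at least min_count times."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    # Slide a window of n words across the text; texts shorter than n yield nothing.
    ngrams = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    counts = Counter(ngrams)
    return {phrase: count for phrase, count in counts.items() if count >= min_count}
```

For example, `detect_repeated_phrases("In today's world, AI matters. In today's world, data matters.")` reports only the phrase `in today's world`. Note that the window ignores sentence boundaries, so a phrase spanning two sentences is counted like any other.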

5.  **Tone Variation Analysis (app/tone_analysis.py):**

    ```python
    # app/tone_analysis.py
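    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    # sent_tokenize needs the Punkt data ('punkt_tab' on NLTK 3.9, 'punkt' on older releases)
    for resource in ('vader_lexicon', 'punkt', 'punkt_tab'):
        nltk.download(resource, quiet=True)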

    def analyze_tone(text):
        """Detects shifts in tone and style."""
        sid = SentimentIntensityAnalyzer()
        sentences = nltk.sent_tokenize(text)
        tone_shifts = []
        previous_score = None
        for i, sentence in enumerate(sentences):
            scores = sid.polarity_scores(sentence)
            compound_score = scores['compound']
            if previous_score is not None:
                delta = abs(compound_score - previous_score)
                if delta > 0.5:  # Adjust threshold as needed
                    tone_shifts.append({"sentence_index": i, "delta": delta})
            previous_score = compound_score
        return tone_shifts
    ```

6.  **Integrate Analysis Modules (app/routes.py):** Call the analysis functions from the route that handles text input. Store the results.
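
One way to wire the modules together is a small helper that runs each analyzer and packs the results into the JSON string that `Analysis.analysis_result` expects. This is a sketch only: `run_analysis` and the `Analyzer` alias are hypothetical names, and the analyzers are passed in as plain callables so the example stays self-contained.

```python
import json
from typing import Any, Callable

Analyzer = Callable[[str], Any]


def run_analysis(text: str, analyzers: dict[str, Analyzer]) -> str:
    """Run every analyzer on the text and return the results as a JSON string."""
    results: dict[str, Any] = {}
    for name, analyze in analyzers.items():
        try:
            results[name] = analyze(text)
        except Exception as exc:
            # One failing module should not discard the output of the others.
            results[name] = {"error": str(exc)}
    return json.dumps(results)
```

In the route, the dictionary would plausibly map names such as `"structure"`, `"repetition"`, and `"tone"` to `analyze_structure`, `detect_repetition`, and `analyze_tone`, with the returned string stored on a new `Analysis` row alongside the `detect_ai` score.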

7.  **Highlighting and Suggestion Interface (app/templates/analysis_result.html):**

    *   Render the analyzed text in a template.
    *   Use HTML and CSS to highlight areas of concern (structural issues, repetition, tone shifts) based on the results from the analysis modules.  Provide suggestions for improvement as tooltips or inline.
    *   Example:

        ```html
        <!-- app/templates/analysis_result.html -->
        <p>
        {# Jinja2 has no enumerate(); loop.index0 is the zero-based position of the word #}
        {% for word in text.split() %}
            {% if loop.index0 in highlighted_indices %}
                <span class="highlighted" title="{{ suggestions[loop.index0] }}">{{ word }}</span>
            {% else %}
                {{ word }}
            {% endif %}
        {% endfor %}
        </p>
        ```
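
The template above expects `highlighted_indices` and `suggestions` keyed by word position, while the analysis modules report issues per sentence. A possible bridge is sketched below. The helper name `build_highlights` is hypothetical, and it assumes a sentence ends at any word finishing with `.`, `!` or `?`, which can disagree with NLTK's sentence indices on abbreviations such as "Dr.".

```python
def build_highlights(text: str, issues: list[dict]) -> tuple[set[int], dict[int, str]]:
    """Map sentence-level issues onto word positions from text.split()."""
    # Assign each word to a sentence number; a word ending in ., ! or ? closes its sentence.
    word_sentence: list[int] = []
    sentence_no = 0
    for word in text.split():
        word_sentence.append(sentence_no)
        if word.endswith((".", "!", "?")):
            sentence_no += 1

    # If several issues share a sentence, the last one wins in this sketch.
    by_sentence = {issue["sentence_index"]: issue.get("suggestion", "") for issue in issues}

    highlighted: set[int] = set()
    suggestions: dict[int, str] = {}
    for word_index, sentence_index in enumerate(word_sentence):
        if sentence_index in by_sentence:
            highlighted.add(word_index)
            suggestions[word_index] = by_sentence[sentence_index]
    return highlighted, suggestions
```

The two return values can be passed to `render_template` as `highlighted_indices` and `suggestions`.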

8.  **Export Functionality (app/routes.py):**

    *   Create a route that allows users to download the analyzed text with highlighting and suggestions in a .txt or .docx file.
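
For the `.txt` variant of the export, the report body can be assembled as a plain string before it is handed to Flask. The sketch below assumes the issue dictionaries carry the `sentence_index`, `issue`, and `suggestion` keys produced by `analyze_structure`; the function name `build_text_report` and the report layout are illustrative.

```python
def build_text_report(text: str, ai_likelihood: float, issues: list[dict]) -> str:
    """Render the analysis as a plain-text report."""
    lines = [
        "Deep Scan Report",
        f"AI likelihood: {ai_likelihood:.0%}",
        "",
        "Analyzed text:",
        text,
        "",
        "Suggestions:",
    ]
    if not issues:
        lines.append("  (none)")
    for issue in issues:
        # sentence_index is zero-based in the analysis modules; show it one-based.
        lines.append(
            f"  - Sentence {issue['sentence_index'] + 1}: "
            f"{issue['issue']}. {issue['suggestion']}"
        )
    return "\n".join(lines) + "\n"
```

In the route, the string could be returned as a Flask `Response` with `mimetype="text/plain"` and a `Content-Disposition: attachment` header so the browser downloads it. A `.docx` export would need an additional library that is not in `requirements.txt`.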

**C. API Documentation**

1.  **Document the endpoints:** Add flasgger docstrings to the backend routes. The `Swagger(app)` call in `create_app` serves the generated UI, at `/apidocs` by default.

**D. Dockerization**

1.  **Dockerfile:**

    ```dockerfile
    FROM python:3.9-slim-buster

    WORKDIR /app

    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt

    COPY . .

    EXPOSE 5000

    CMD ["python", "run.py"]
    ```

2.  **docker-compose.yml:**

    ```yaml
    version: "3.8"
    services:
      web:
        build: .
        ports:
          - "5000:5000"
        volumes:
          - .:/app
        environment:
          - FLASK_APP=run.py
          - FLASK_DEBUG=1 # FLASK_ENV was removed in Flask 2.3
          - SECRET_KEY=your_secret_key # Change this!
          - DATABASE_URL=sqlite:///site.db # config.py reads DATABASE_URL
        depends_on:
          #Add any other services here if more are created
          []
    ```

## 5. UI/UX Guidelines

*   **Clean and Simple Interface:**  Focus on readability and ease of use. Avoid clutter.
*   **Clear Visual Hierarchy:** Use typography and spacing to guide the user's eye.
*   **Intuitive Highlighting:** Use distinct colors or styling to highlight different types of issues (structural, repetition, tone).
*   **Actionable Suggestions:** Provide clear and concise suggestions for improvement.  Consider allowing users to "accept" suggestions directly in the interface.
*   **Mobile Responsive:** The application should be usable on different screen sizes.
*   **Focus on Confidence:** The primary goal is to allow users to publish with confidence by improving trust in the content.

## 6. Instructions

1.  **Clone the repository:** `git clone <your_repository_url>`
2.  **Navigate to the project directory:** `cd deepscan`
3.  **Create and activate the virtual environment:** `python -m venv venv` and `source venv/bin/activate` (Linux/macOS) or `venv\Scripts\activate` (Windows)
4.  **Install dependencies:** `pip install -r requirements.txt`
5.  **Initialize the database:**

    ```bash
    flask db init
    flask db migrate -m "Initial migration"
    flask db upgrade
    ```

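6.  **Run the application:** `python run.py`. Create `run.py` in the project root with `from app import create_app; app = create_app(); app.run(host='0.0.0.0', debug=True)`. Binding to `0.0.0.0` lets the Docker port mapping reach the server; turn `debug` off outside development.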
7.  **Docker Deployment:**
    *   `docker-compose build`
    *   `docker-compose up`
    *   Access the app at `http://localhost:5000`

## Development Iterations

1.  **MVP:** Focus on core functionality: text input, basic AI detection (using a simple model), structural analysis, repetition detection, and basic highlighting. User authentication is a must.
2.  **Advanced AI Detection:** Integrate a pre-trained transformer model (BERT, RoBERTa) for more accurate AI detection. Fine-tune the model on a large dataset of human-written and AI-generated texts.
3.  **Improved UI/UX:** Enhance the user interface with better highlighting, more detailed suggestions, and a more intuitive workflow. Add user preferences (e.g., toggle highlighting, customize analysis parameters).
4.  **Enhanced Analysis Modules:** Add more sophisticated analysis techniques (e.g., deeper structural analysis, plagiarism detection).
5.  **Collaboration Features:** Allow users to share analyses and collaborate on improving text.

This prompt provides a comprehensive roadmap for building the Deep Scan MVP. Remember to prioritize core features, iterate based on user feedback, and continuously improve the accuracy and usability of the application. Good luck!
```