Challenge 1b: Multi-Collection PDF Analysis

Overview

This solution analyzes multiple PDF document collections and extracts the content most relevant to a given persona and task.

Project Structure

Challenge_1b/
├── Collection 1/                    # Travel Planning
│   ├── PDFs/                       # South of France guides
│   ├── challenge1b_input.json      # Input configuration
│   └── challenge1b_output.json     # Analysis results
├── Collection 2/                    # Adobe Acrobat Learning
│   ├── PDFs/                       # Acrobat tutorials
│   ├── challenge1b_input.json      # Input configuration
│   └── challenge1b_output.json     # Analysis results
├── Collection 3/                    # Recipe Collection
│   ├── PDFs/                       # Cooking guides
│   ├── challenge1b_input.json      # Input configuration
│   └── challenge1b_output.json     # Analysis results
└── README.md
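
The layout above can be walked programmatically. The following is a minimal sketch, not the repository's own code: the helper names `find_collections` and `list_pdfs` are hypothetical, and it assumes every collection folder holds a `challenge1b_input.json` file next to a `PDFs/` directory, as the tree shows.

```python
from pathlib import Path

INPUT_NAME = "challenge1b_input.json"  # file name taken from the tree above


def find_collections(root):
    """Return the sub-directories of root that contain an input JSON file.

    Hypothetical helper: folders without the input file are skipped.
    """
    root = Path(root)
    return sorted(
        p for p in root.iterdir()
        if p.is_dir() and (p / INPUT_NAME).is_file()
    )


def list_pdfs(collection_dir):
    """Return the PDF files stored in a collection's PDFs/ folder."""
    return sorted((Path(collection_dir) / "PDFs").glob("*.pdf"))
```

Calling `find_collections("Challenge_1b")` would then yield one path per collection, and `list_pdfs` could be applied to each to enumerate its documents.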

Collections

Collection 1: Travel Planning

Challenge ID: round_1b_002
Persona: Travel Planner
Task: Plan a 4-day trip for 10 college friends to the South of France
Documents: 7 travel guides

Collection 2: Adobe Acrobat Learning

Challenge ID: round_1b_003
Persona: HR Professional
Task: Create and manage fillable forms for onboarding and compliance
Documents: 15 Acrobat guides

Collection 3: Recipe Collection

Challenge ID: round_1b_001
Persona: Food Contractor
Task: Prepare a vegetarian buffet-style dinner menu for a corporate gathering
Documents: 9 cooking guides

Input/Output Format

Input JSON Structure

{
  "challenge_info": {
    "challenge_id": "round_1b_XXX",
    "test_case_name": "specific_test_case"
  },
  "documents": [{"filename": "doc.pdf", "title": "Title"}],
  "persona": {"role": "User Persona"},
  "job_to_be_done": {"task": "Use case description"}
}
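
As an illustration of how this structure might be consumed, here is a small sketch that parses an input document and pulls out the fields shown above. The function name `parse_input` and the shape of the returned dictionary are assumptions made for this example, not part of the repository.

```python
import json

# Top-level keys shown in the input structure above.
REQUIRED_KEYS = ("challenge_info", "documents", "persona", "job_to_be_done")


def parse_input(raw):
    """Parse an input JSON string and return its main fields in a flat dict.

    Hypothetical helper: raises ValueError if a top-level key is missing.
    """
    config = json.loads(raw)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    return {
        "challenge_id": config["challenge_info"]["challenge_id"],
        "filenames": [doc["filename"] for doc in config["documents"]],
        "persona": config["persona"]["role"],
        "task": config["job_to_be_done"]["task"],
    }
```

To read a file from disk, pass `Path(...).read_text(encoding="utf-8")` to `parse_input`.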

Output JSON Structure

{
  "metadata": {
    "input_documents": ["list"],
    "persona": "User Persona",
    "job_to_be_done": "Task description"
  },
  "extracted_sections": [
    {
      "document": "source.pdf",
      "section_title": "Title",
      "importance_rank": 1,
      "page_number": 1
    }
  ],
  "subsection_analysis": [
    {
      "document": "source.pdf",
      "refined_text": "Content",
      "page_number": 1
    }
  ]
}
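
A sketch of how a result in this shape could be assembled is given below. It assumes each candidate section is a dictionary carrying a numeric `score` and a `text` field; both names are hypothetical, as is the function `build_output`. Only the output keys come from the structure above.

```python
def build_output(filenames, persona, task, sections):
    """Assemble the output dictionary from scored sections.

    Each entry of sections is assumed (for this sketch) to have the keys
    document, section_title, page_number, text and score.
    """
    # Highest score first; rank 1 is the most important section.
    ranked = sorted(sections, key=lambda s: s["score"], reverse=True)
    return {
        "metadata": {
            "input_documents": list(filenames),
            "persona": persona,
            "job_to_be_done": task,
        },
        "extracted_sections": [
            {
                "document": s["document"],
                "section_title": s["section_title"],
                "importance_rank": rank,
                "page_number": s["page_number"],
            }
            for rank, s in enumerate(ranked, start=1)
        ],
        "subsection_analysis": [
            {
                "document": s["document"],
                "refined_text": s["text"],
                "page_number": s["page_number"],
            }
            for s in ranked
        ],
    }
```

The returned dictionary can be written out with `json.dump(result, f, indent=2)` to produce `challenge1b_output.json`.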

Key Features

Persona-based content analysis
Importance ranking of extracted sections
Multi-collection document processing
Structured JSON output with metadata
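
This README does not describe how relevance to a persona is computed. Purely as an illustration of persona-based ranking, the sketch below scores a section by the fraction of persona and task words that appear in its text. This keyword-overlap measure is a stand-in chosen for simplicity, and the names `tokenize` and `relevance_score` are hypothetical; the actual solution may use a different method.

```python
import re


def tokenize(text):
    """Lower-case the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def relevance_score(section_text, persona, task):
    """Return the share of persona/task words found in the section text.

    Illustrative only: a plain keyword overlap in the range 0.0 to 1.0.
    """
    query = tokenize(persona) | tokenize(task)
    if not query:
        return 0.0
    return len(query & tokenize(section_text)) / len(query)
```

Sorting sections by this score in descending order would give one possible importance ranking.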

Note: This README provides a brief overview of the Challenge 1b solution structure based on available sample data.
