Data Sources
Transparency about our data sources and collection methods is essential for building trust. This page explains where our content comes from, how it's collected, and how we ensure quality and accuracy.
Primary Data Sources
- Partner Archives: Collaborations with institutional archives, libraries, and cultural organizations that specialize in movement documentation.
- Open Repositories: Content from publicly accessible repositories, Creative Commons platforms, and open access databases.
- Online Sources: Websites, YouTube channels, social media platforms, and other online resources related to artivism. We reference and link to these sources, respecting fair use principles for educational and documentation purposes.
- Community Contributions: User-submitted content that meets our editorial standards and inclusion criteria.
- Curated Collections: Editorial collections assembled by our team to highlight specific movements, themes, or time periods.
- Public Domain Works: Historical content that has entered the public domain and is freely available for use.
Collection Methodology
Automated Collection
Some content is collected automatically from partner sources and open repositories. Automated processes:
- Monitor RSS feeds and API endpoints from partner sources
- Extract metadata using standardized schemas
- Validate license information and confirm that linked items remain accessible
- Flag items for manual review when needed
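The monitoring-and-flagging steps above can be sketched roughly as follows. This is a minimal illustration, not our actual pipeline: the feed layout, field names, and the `collect` helper are all assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; real input comes from partner RSS/API endpoints.
SAMPLE_FEED = """<rss><channel>
  <item><title>Mural Project</title><author>A. Rivera</author>
        <license>CC-BY-4.0</license></item>
  <item><title>Street Performance</title><author>B. Chen</author></item>
</channel></rss>"""

def collect(feed_xml):
    """Extract minimal metadata from each feed item and flag
    records with missing license info for manual review."""
    records = []
    for item in ET.fromstring(feed_xml).iter("item"):
        record = {
            "title": item.findtext("title"),
            "author": item.findtext("author"),
            "license": item.findtext("license"),
        }
        # Items without license information go to the review queue.
        record["needs_review"] = record["license"] is None
        records.append(record)
    return records

for r in collect(SAMPLE_FEED):
    print(r["title"], "-> review" if r["needs_review"] else "-> accepted")
```

The key design point is the last bullet above: automation never silently accepts an item it cannot fully validate; it flags the item for a human instead.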
Manual Curation
All content undergoes manual review to ensure quality and accuracy. Our curation process includes:
- Verification of metadata accuracy
- License validation and attribution checking
- Content relevance assessment
- Quality control and standardization
Update Frequency
The archive is updated on an ongoing schedule:
- Automated sources: Daily checks for new content
- Manual curation: Weekly additions and updates
- Community contributions: Reviewed within 5-7 business days
- Metadata updates: Ongoing corrections and improvements
Quality Assurance
Data Validation
All metadata is validated against our schema requirements. Required fields include title, author, date, and license information. Optional fields are standardized when present.
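A minimal sketch of that required-field check, using the required fields named above (the `validate` helper itself is an illustrative assumption, not our production validator):

```python
# Required fields per our metadata schema, as listed above.
REQUIRED_FIELDS = ("title", "author", "date", "license")

def validate(record):
    """Return the list of required fields that are missing or empty."""
    return [f for f in REQUIRED_FIELDS if not record.get(f)]

# Hypothetical record: present fields pass, the empty license does not.
item = {"title": "Protest Poster, 1968", "author": "Unknown",
        "date": "1968", "license": ""}
print(validate(item))  # → ['license']
```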
Duplicate Detection
We use automated and manual processes to detect and merge duplicate records. When duplicates are found, we preserve the most complete metadata and link related versions.
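The merge rule described above — keep the most complete metadata and link related versions — can be illustrated like this. The record shape, `id` field, and `merge` helper are hypothetical:

```python
def merge(a, b):
    """Merge two duplicate records, keeping the most complete value
    for each field: a non-empty value wins over a missing/empty one."""
    merged = dict(a)
    for key, value in b.items():
        if not merged.get(key) and value:
            merged[key] = value
    # Link the related versions rather than discarding either record id.
    merged["related"] = sorted({a.get("id"), b.get("id")} - {None})
    return merged

# Hypothetical duplicates: same work, one has the date the other lacks.
rec1 = {"id": "rec-1", "title": "Banner Drop", "date": ""}
rec2 = {"id": "rec-2", "title": "Banner Drop", "date": "2019"}
print(merge(rec1, rec2))
```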
Error Correction
Errors are corrected through:
- Automated validation checks
- User-reported corrections
- Regular manual review cycles
- Version tracking for significant changes
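Version tracking for a correction might look like the following sketch; `apply_correction`, the `history` structure, and the source labels are illustrative assumptions:

```python
import datetime

def apply_correction(record, field, new_value, source):
    """Apply a correction while appending a history entry for the change."""
    history = record.setdefault("history", [])
    history.append({
        "field": field,
        "old": record.get(field),
        "new": new_value,
        "source": source,  # e.g. "user-report" or "validation-check"
        "at": datetime.date.today().isoformat(),
    })
    record[field] = new_value
    return record

# Hypothetical correction submitted by a user.
rec = {"title": "Untitled"}
apply_correction(rec, "title", "Subway Stencils", "user-report")
print(rec["title"])
```

Keeping the old value alongside the new one is what makes significant changes auditable later.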
Transparency
We maintain transparency about data quality by documenting known limitations, update frequencies, and source reliability. See our corrections policy for more information.
Known Limitations
While we strive for comprehensiveness, the archive has some limitations:
- Coverage gaps: Some movements or time periods may be underrepresented due to limited source availability
- Temporal bias: Recent content may be more thoroughly documented than historical content
- Geographic bias: Coverage varies by region based on source availability and language
- Metadata completeness: Historical content may have incomplete metadata compared to contemporary items
We continuously work to address these limitations through expanded partnerships, improved collection methods, and community contributions.
Data Reliability
We ensure data reliability through:
- Regular validation and quality checks
- Source verification and documentation
- Version control for significant changes
- Clear attribution of sources and licenses
- Transparent documentation of limitations
Learn More
- Editorial Policy — How we curate and manage content
- How It Works — System architecture overview
- Contribute — Help us expand the archive