Open Source · Merged PR

dbt-score Open Source Contribution

Added seed resource support to dbt metadata quality linter

Overview

dbt-score is an open-source tool developed by Picnic that helps data teams maintain high-quality dbt projects by scoring models against configurable rules. I contributed a feature that extends the tool to seed resources, so teams can apply the same metadata quality standards to seeds that they already apply to models.

Merged Contribution: Pull Request #110 was successfully merged into the main branch, adding comprehensive seed support with full test coverage and documentation.

The Challenge

dbt-score originally only supported scoring of models, but many dbt projects rely heavily on seed files for reference data, lookup tables, and configuration. Without the ability to score seeds, teams couldn't ensure consistent quality standards across their entire dbt project.

Key Challenges

• Seeds have a different metadata structure from models
• Existing rules had to keep working unchanged (backward compatibility)
• All new functionality needed comprehensive test coverage
• The change had to follow established project conventions and coding standards

Solution

I added seed support to dbt-score within the tool's existing architecture, so seeds are linted and scored through the same workflow and user experience as models.

Implementation Details

1. Extended Core Functionality

• Added seed resource type to the scoring engine
• Implemented seed-specific metadata parsing
• Created seed manifest loader functionality
• Maintained consistent API with model scoring
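The manifest-loading step listed above can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: the `Seed` dataclass and `load_seeds` function are hypothetical names, and the only thing taken from dbt itself is that `manifest.json` keeps resources under `nodes`, each carrying a `resource_type` field.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Seed:
    """Hypothetical container for the seed metadata a linter needs."""

    unique_id: str
    name: str
    description: str = ""
    tags: list[str] = field(default_factory=list)
    meta: dict[str, Any] = field(default_factory=dict)


def load_seeds(manifest: dict[str, Any]) -> list[Seed]:
    """Collect every manifest node whose resource_type is 'seed'."""
    seeds = []
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "seed":
            continue  # models, tests, snapshots, etc. are skipped here
        seeds.append(
            Seed(
                unique_id=unique_id,
                name=node.get("name", ""),
                description=node.get("description", ""),
                tags=list(node.get("tags", [])),
                meta=dict(node.get("meta", {})),
            )
        )
    return seeds


# Tiny inline manifest for illustration; a real one lives in target/manifest.json.
MANIFEST_JSON = """
{
  "nodes": {
    "model.shop.orders": {"resource_type": "model", "name": "orders"},
    "seed.shop.country_codes": {
      "resource_type": "seed",
      "name": "country_codes",
      "description": "ISO country codes.",
      "tags": ["reference"],
      "meta": {"owner": "data-platform"}
    }
  }
}
"""

seeds = load_seeds(json.loads(MANIFEST_JSON))
print([seed.name for seed in seeds])
```

Filtering on `resource_type` keeps the loader independent of how unique IDs are formatted, which is one plausible way to keep the seed path parallel to the model path.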

2. Created Seed-Specific Rules

Implemented 4 new linting rules:

• seed_has_description
• seed_has_meta_keys
• seed_has_tags
• seed_has_tests
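To give a feel for what such rules check, here is a self-contained sketch written as plain functions. The rule names come from the list above, but the exact semantics are assumptions: the `Seed` dataclass, the `tests` field, the `lint` helper, and the default required meta key `owner` are all hypothetical and do not reproduce dbt-score's own rule interface.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Seed:
    """Hypothetical seed record; fields are assumptions for illustration."""

    name: str
    description: str = ""
    tags: list[str] = field(default_factory=list)
    meta: dict[str, Any] = field(default_factory=dict)
    tests: list[str] = field(default_factory=list)


# Each rule returns None when the seed passes, or a violation message.
def seed_has_description(seed: Seed) -> Optional[str]:
    if not seed.description.strip():
        return "Seed lacks a description."
    return None


def seed_has_meta_keys(seed: Seed, required_keys=("owner",)) -> Optional[str]:
    # "owner" is an assumed default; real projects would configure this.
    missing = [key for key in required_keys if key not in seed.meta]
    if missing:
        return f"Seed is missing meta keys: {', '.join(missing)}."
    return None


def seed_has_tags(seed: Seed) -> Optional[str]:
    if not seed.tags:
        return "Seed has no tags."
    return None


def seed_has_tests(seed: Seed) -> Optional[str]:
    if not seed.tests:
        return "Seed has no tests."
    return None


RULES = [seed_has_description, seed_has_meta_keys, seed_has_tags, seed_has_tests]


def lint(seed: Seed) -> dict[str, Optional[str]]:
    """Run every rule and map rule name to its violation message (or None)."""
    return {rule.__name__: rule(seed) for rule in RULES}


bare = Seed(name="tmp_lookup")
for rule_name, message in lint(bare).items():
    print(f"{rule_name}: {message or 'ok'}")
```

Returning a message rather than raising keeps each rule a pure function, which makes the rules easy to unit-test one at a time.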

3. Test-Driven Development

• Wrote comprehensive unit tests for all new functionality
• Added integration tests for end-to-end scenarios
• Achieved 100% test coverage for new code
• Ensured all existing tests continued to pass
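A test-first workflow for one rule might look like the sketch below. The tests are plain functions with bare `assert` statements, which pytest discovers by their `test_` prefix; the `Seed` dataclass and the rule body are hypothetical stand-ins, not the project's actual test suite.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Seed:
    """Hypothetical minimal seed record used only by these tests."""

    name: str
    description: str = ""


def seed_has_description(seed: Seed) -> Optional[str]:
    """Return a violation message, or None when the seed is documented."""
    if not seed.description.strip():
        return "Seed lacks a description."
    return None


# In TDD these tests are written first and fail until the rule exists.
def test_seed_with_description_passes():
    seed = Seed(name="country_codes", description="ISO country codes.")
    assert seed_has_description(seed) is None


def test_seed_without_description_is_flagged():
    seed = Seed(name="country_codes")
    assert seed_has_description(seed) == "Seed lacks a description."


def test_whitespace_only_description_is_flagged():
    seed = Seed(name="country_codes", description="   ")
    assert seed_has_description(seed) is not None
```

Saved as a file such as `test_seed_rules.py` (a hypothetical name), this would run with a plain `pytest` invocation.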

Technical Implementation

Technical Approach

# Illustrative example of seed scoring (score_seeds is a simplified
# interface shown for clarity and may not match dbt-score's published API)
from dbt_score import score_seeds

# Score all seeds in the project
results = score_seeds(
    manifest_path="target/manifest.json",
    rules=[
        "seed_has_description",
        "seed_has_tests",
        "seed_has_tags",
        "seed_has_meta_keys"
    ]
)

# Seeds are now scored just like models
for seed_name, score in results.items():
    print(f"{seed_name}: {score}/10")
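For intuition about the "out of 10" figure printed in the example above, the simplest possible scoring is the fraction of passed rules scaled to ten. The sketch below shows only that simplification; `score_out_of_ten` is a hypothetical helper, and the real tool's scoring may weight rules (for example by severity), which is not modelled here.

```python
def score_out_of_ten(results: dict[str, bool]) -> float:
    """Scale the share of passed rules to a 0-10 score (unweighted)."""
    if not results:
        # Assumption: with no rules applied there is nothing to penalise.
        return 10.0
    passed = sum(results.values())  # True counts as 1, False as 0
    return round(10 * passed / len(results), 1)


rule_results = {
    "seed_has_description": True,
    "seed_has_tests": True,
    "seed_has_tags": True,
    "seed_has_meta_keys": False,
}
print(f"country_codes: {score_out_of_ten(rule_results)}/10")
```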

Development Process

1. Research & Planning: Analyzed existing codebase and dbt seed structure to understand integration requirements
2. Implementation: Developed feature using TDD, writing tests first and then implementing functionality
3. Documentation: Updated README and added examples for using the new seed scoring features
4. Review & Iteration: Addressed maintainer feedback and refined implementation based on suggestions

Project Details

Role: Open Source Contributor
Context: Community Contribution
Timeline: 2024
Client: Picnic Supermarket

Tech Stack

Technologies

Python, dbt, pytest, Test-Driven Development, GitHub Actions

Impact

Contribution: PR #110
Impact Level: Feature
Test Coverage: 100%

Key Achievements

Contribution Impact

• Successfully merged as Pull Request #110
• Extended dbt-score's scoring beyond models to cover seed resources
• Added 4 new seed-specific linting rules
• Maintained 100% backward compatibility
• Comprehensive test suite with full coverage
• Clear documentation and usage examples

Community Reception

The contribution was well-received by the dbt-score maintainers and community:

• Praised for following project conventions and coding standards
• Appreciated the comprehensive test coverage
• Feature requested by multiple users in the community
• Smooth integration with minimal review iterations

View the merged pull request on GitHub