This section shows how to connect a GitHub repository to MindsDB.

GitHub is a web-based platform used primarily for version control and collaborative software development. It lets developers and teams host, review, and manage source code for software projects.

Data from GitHub, including issues and pull requests (PRs), can be used within MindsDB to make relevant predictions or to automate issue and PR creation.

Prerequisites

Before proceeding, ensure the following prerequisites are met:

  1. Install MindsDB locally via Docker or use MindsDB Cloud.
  2. To connect GitHub to MindsDB, install the required dependencies by following these instructions.
  3. Install or ensure access to GitHub.

Connection

This handler is implemented using PyGithub, a Python library that wraps the GitHub API v3.

The arguments used to establish a connection are as follows. Only repository is required:

  • repository is the GitHub repository name, given as owner/name (for example, mindsdb/mindsdb).
  • api_key is an optional GitHub API key to use for authentication.
  • github_url is an optional GitHub URL to connect to a GitHub Enterprise instance.

Check out this guide on how to create a GitHub API key.

Using an API key is recommended, because unauthenticated requests have a much lower rate limit and can fail with the "API rate limit exceeded" error.

Here is how to connect to the MindsDB GitHub repository:

CREATE DATABASE mindsdb_github
WITH ENGINE = 'github',
PARAMETERS = {
  "repository": "mindsdb/mindsdb"
};
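
The statement above uses only the required repository parameter. As a minimal sketch of how the optional parameters described earlier might be supplied, the two statements below add api_key and github_url. The connection names mindsdb_github_auth and enterprise_github, the repository my-org/my-repo, the token value, and the Enterprise URL are all hypothetical placeholders. The exact URL form expected for github_url is also an assumption, so check it against your GitHub Enterprise setup.

```sql
-- Authenticated connection to a repository on github.com.
-- "your_github_api_key" is a placeholder, not a real token.
CREATE DATABASE mindsdb_github_auth
WITH ENGINE = 'github',
PARAMETERS = {
  "repository": "mindsdb/mindsdb",
  "api_key": "your_github_api_key"
};

-- Connection to a GitHub Enterprise instance.
-- The repository name and URL below are hypothetical examples.
CREATE DATABASE enterprise_github
WITH ENGINE = 'github',
PARAMETERS = {
  "repository": "my-org/my-repo",
  "api_key": "your_github_api_key",
  "github_url": "https://github.example.com/api/v3"
};
```

Each connection is then queried under its own name, for example mindsdb_github_auth.issues.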

Usage

The mindsdb_github connection contains two tables: issues and pull_requests.

Here is how to query for all issues:

SELECT *
FROM mindsdb_github.issues;

You can run more advanced queries that select specific columns, filter issues by state, and limit the number of rows returned:

SELECT number, state, creator, assignees, title, labels
FROM mindsdb_github.issues
WHERE state = 'open'
LIMIT 10;

And the same goes for pull requests:

SELECT number, state, title, creator, head, commits
FROM mindsdb_github.pull_requests
WHERE state = 'open'
LIMIT 10;

For more information about available actions and development plans, visit this page.