Sven and I agreed that we should start a slightly more structured issue around the proposal of a metadata file that packages may add. So here goes.
Why?
The primary goal of this site is to help people make better decisions about the packages they are choosing. The metadata we currently use to help people make those decisions comes from the manifest, the repository, and GitHub. There's more we could do with better information though, so we're considering proposing a standard metadata file that package authors can use to inform the Swift Package Index better. It would also mean that other hosting providers (for example GitLab and Bitbucket, as well as self-hosted repositories) would be on equal footing with GitHub.
What is the file, and where is it stored?
We see this metadata file hosted in the root of a package repository. It's better for everyone if the package author is in control of the file, and of course, it means that other projects can also take advantage of the information.
What information would it include?
This thread aims to gather feedback from the community on what information would be useful in this metadata file. I will update the list here as feedback comes in. Here's what we have so far:
- Package information:
  - Abstract (thanks Erica)
    - A shorter, one-line description.
  - Description (details here)
  - Categories/Tags:
    - Manually defined (thanks to Erica)
    - Some tags could potentially be derived from imported frameworks (see here). This would not be in the metadata file of course, but I think it is worth mentioning here.
  - Home page URL
  - Documentation page URL
  - Example project URL (thanks James)
  - Auxiliary URLs (thanks Dave)
    - A set of any number of URLs to other resources. We’d need to capture both a URL and the text to use as the link.
  - Is the package deprecated?
    - If so, is there a successor package?
  - Related packages: (multiple items of) (thanks James)
    - Package URL (The name and all other metadata can be derived from this)
  - License (thanks Johan) - See comments here too.
- Maintainer: (thanks Erica) (exactly one of)
- Author information: (multiple sets of)
  - Name
  - Email address
  - Personal URL (Home page, Twitter, GitHub page, etc…)
- Funding/sponsorship/donation information: (thanks Max)
  - Does the package accept funding?
  - Funding URL
- Other platform support status:
  - Linux (thanks Max)
    - A boolean is the simplest way to declare support.
    - Explicit support for named versions of Linux is more comprehensive.
  - Windows, and other platforms as they are added. (thanks Erica)
By definition, as the file will not exist in all repositories, all of this data will be optional. No package author should be required to add any data that they are not comfortable with sharing.
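To make the "everything is optional" point concrete, here is a rough sketch of how the list above might look as a Swift type. Every type and property name here (`PackageMetadata`, `Author`, `Link`, `supportsLinux`, and so on) is made up for illustration, and modelling the maintainer as a single `Author` is my assumption. JSON appears only because Foundation ships a decoder for it; this is not a suggestion for the file format.

```swift
import Foundation

// Hypothetical names throughout; nothing here is an agreed schema.
struct Author: Codable, Equatable {
    var name: String?
    var email: String?
    var url: URL?
}

// An auxiliary link needs both the URL and the text to show for it.
struct Link: Codable, Equatable {
    var title: String
    var url: URL
}

struct PackageMetadata: Codable, Equatable {
    var abstract: String?
    var description: String?
    var tags: [String]?
    var homepage: URL?
    var documentation: URL?
    var exampleProject: URL?
    var auxiliaryLinks: [Link]?
    var isDeprecated: Bool?
    var successor: URL?
    var relatedPackages: [URL]?
    var license: String?
    var maintainer: Author?      // assumption: same shape as an author
    var authors: [Author]?
    var acceptsFunding: Bool?
    var fundingURL: URL?
    var supportsLinux: Bool?     // the simple boolean variant
}

// Because every property is optional, an empty document is still valid.
let empty = try JSONDecoder().decode(PackageMetadata.self, from: Data("{}".utf8))

// An author can share as much or as little as they like.
let partialJSON = #"{"abstract": "A tiny parser", "supportsLinux": true}"#
let partial = try JSONDecoder().decode(PackageMetadata.self, from: Data(partialJSON.utf8))
```

With this shape, a consumer has to handle `nil` for every field, which matches the idea that no author is required to provide anything.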
There is a valid issue with versioning the metadata file, brought up by Mattt here. For the Swift Package Index, while we would potentially store this information against the versions, we would use the latest version of the file on the default branch to build the package pages.
Structured or Unstructured data
There's an argument for making all non-technical metadata a tag-like structure. For example, categories, Linux support, and other things could be indicated with string tags, as mentioned by Erica here.
We want to bring as much of the information that's needed to judge the quality of a package into one place. For example, instead of having to check how many pull requests/issues there are and when the last one was closed, we bring that in automatically, right alongside information about what versions of Swift the package supports, and whether the stable release is the right one to target, or if there's actually a beta which would better suit your needs.
All of that data so far is structured, as it comes from the manifest, from GitHub, and from the repository itself. There's a place for unstructured/tag-based data, but I don't think it completely replaces the need for structure.
We also want to use some of this structured data to drive a "quality score" for a package. I don't think it's clear yet whether this quality score is made public or just used internally for search ranking (we have a version of this already); there are pros and cons to both. But if metadata is just tag-based, it's much harder to do that, especially when tags can be typed incorrectly or interpreted in different ways (do `linux` and `ubuntu-18.04` get points for supporting Linux, where `ubuntu1804` doesn't?). It's definitely a trade-off. Just a note: I'm not saying packages would definitely get an increased score for supporting Linux; it's just an illustration.
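As a small illustration of that trade-off, here is a sketch comparing a free-form tag check with a structured one. The recognised tag set, the `Platform` enum, and both `supportsLinux` functions are hypothetical; this is not how the search ranking works today.

```swift
// Hypothetical: the tags a naive scorer happens to recognise as "Linux".
let linuxTags: Set<String> = ["linux", "ubuntu-18.04"]

// Free-form approach: only exact (case-insensitive) matches count.
func supportsLinux(tags: [String]) -> Bool {
    tags.contains { linuxTags.contains($0.lowercased()) }
}

// Structured approach: the set of valid values is fixed up front,
// so there is nothing to misspell or interpret.
enum Platform: String, Codable {
    case linux
    case windows
}

func supportsLinux(platforms: Set<Platform>) -> Bool {
    platforms.contains(.linux)
}

let plainTag = supportsLinux(tags: ["linux"])               // true
let versionedTag = supportsLinux(tags: ["Ubuntu-18.04"])    // true
let variantTag = supportsLinux(tags: ["ubuntu1804"])        // false: same intent, different spelling
let structured = supportsLinux(platforms: [.linux])         // true
```

The tag matcher could of course be made smarter, but every extra spelling it learns is a guess about what the author meant, whereas the structured value is unambiguous.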
Scope of this thread
I think it's worth keeping the discussion to the information at this point, rather than the specifics of the format that we’ll use to represent it. That's a separate discussion, and the data we decide to include will influence the format.
At this point, we should include ALL suggestions for metadata. Of course, it's fine to put forward views on why you feel a piece of metadata shouldn't be considered, but I won't remove any until we've got a comprehensive collection of everything under consideration. I’ll keep the list above up to date as more suggestions are added in the comments below.
Eventually moving metadata into Package.swift
If this process is successful, it's worth considering whether this metadata should be merged into the Package.swift manifest through the evolution process. It's an idea, but probably only for some of the metadata.
The package manifest holds information about the technical details of the package, and I think we should be careful about mixing descriptive metadata in with that. So, if we see that Linux support is something people use this metadata file for, we think that would make a great addition to the official manifest. However, for things like description, tags, author information, etc… a separate file feels better.