Change how repo URLs are input into criticality_score and create a Docker image #229

Merged
merged 9 commits into from
Nov 4, 2022
12 changes: 12 additions & 0 deletions .dockerignore
@@ -0,0 +1,12 @@
# Ignore irrelevant files
.github
infra
docs
images

# Ignore Dockerfile - this improves caching.
**/Dockerfile

# Ignore the deprecated Python project
criticality_score
*.py
4 changes: 3 additions & 1 deletion Makefile
Expand Up @@ -33,9 +33,11 @@ lint: ## Run linter
lint: $(GOLANGCI_LINT)
$(GOLANGCI_LINT) run -c .golangci.yml

docker-targets = build/docker/enumerate-github
docker-targets = build/docker/enumerate-github build/docker/criticality-score
.PHONY: build/docker $(docker-targets)
build/docker: $(docker-targets) ## Build all docker targets
build/docker/criticality-score:
DOCKER_BUILDKIT=1 docker build . -f cmd/criticality_score/Dockerfile --tag $(IMAGE_NAME)-cli
build/docker/enumerate-github:
DOCKER_BUILDKIT=1 docker build . -f cmd/enumerate_github/Dockerfile --tag $(IMAGE_NAME)-enumerate-github

30 changes: 30 additions & 0 deletions cmd/criticality_score/Dockerfile
@@ -0,0 +1,30 @@
# Copyright 2022 Criticality Score Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

FROM golang@sha256:122f3484f844467ebe0674cf57272e61981770eb0bc7d316d1f0be281a88229f AS base
WORKDIR /src
ENV CGO_ENABLED=0
COPY go.mod go.sum ./
RUN go mod download
COPY . ./

FROM base AS criticality_score
ARG TARGETOS
ARG TARGETARCH
RUN CGO_ENABLED=0 go build ./cmd/criticality_score

FROM gcr.io/distroless/base:nonroot@sha256:533c15ef2acb1d3b1cd4e58d8aa2740900cae8f579243a53c53a6e28bcac0684
COPY --from=criticality_score /src/criticality_score ./criticality_score
COPY --from=criticality_score --chmod=775 /src/config/scorer/* ./config/scorer/
ENTRYPOINT ["./criticality_score"]
32 changes: 18 additions & 14 deletions cmd/criticality_score/README.md
Expand Up @@ -25,11 +25,14 @@ $ go install github.com/ossf/criticality_score/cmd/criticality_score
## Usage

```shell
$ criticality_score [FLAGS]... IN_FILE
$ criticality_score [FLAGS]... {FILE|REPO...}
```

Project repository URLs are read from the specified `IN_FILE`. If `-` is passed
in as an `IN_FILE`, URLs will be read from STDIN.
Project repository URLs are read either from the specified `FILE`, or from the
command line arguments.
If `-` is passed in as a `FILE`, URLs will be read from STDIN. If `FILE` does not
exist it will be treated as a `REPO`.
Each `REPO` is a project repository URL.

Results are written in CSV format to the output. By default `stdout` is used for
output.
Expand All @@ -38,7 +41,8 @@ output.

### Authentication

`criticality_score` requires authentication to GitHub, and optionally Google Cloud Platform to run.
`criticality_score` requires authentication to GitHub, and optionally Google
Cloud Platform to run.

#### GitHub Authentication

Expand Down Expand Up @@ -104,11 +108,12 @@ See more on GCP

#### Output flags

- `-out FILE` specify the `FILE` to use for output. By default `stdout` is used.
- `-append` appends output to `FILE` if it already exists.
- `-force` overwrites `FILE` if it already exists and `-append` is not set.
- `-out OUTFILE` specify the `OUTFILE` to use for output. By default `stdout` is used.
- `-append` appends output to `OUTFILE` if it already exists.
- `-force` overwrites `OUTFILE` if it already exists and `-append` is not set.

If `FILE` exists and neither `-append` nor `-force` is set the command will fail.
If `OUTFILE` exists and neither `-append` nor `-force` is set the command will
fail.

#### Google Cloud Platform flags

Expand Down Expand Up @@ -194,14 +199,13 @@ Rather than installing the binary, use `go run` to run the command.
For example:

```shell
$ go run ./cmd/criticality_score [FLAGS]... IN_FILE...
$ go run ./cmd/criticality_score [FLAGS]... {FILE|REPO...}
```

Pass in a single repo using echo to quickly test signal collection, for example:
Pass in a single repo to quickly test signal collection, for example:

```shell
$ echo "https://github.com/django/django" | \
go run ./cmd/criticality_score \
-log=debug \
-
$ go run ./cmd/criticality_score \
-log=debug \
https://github.com/django/django
```
139 changes: 139 additions & 0 deletions cmd/criticality_score/input.go
@@ -0,0 +1,139 @@
// Copyright 2022 Criticality Score Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// https://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package main

import (
"bufio"
"context"
"errors"
"io"
"net/url"
"os"

"github.com/ossf/criticality_score/internal/infile"
)

// iter is a simple interface for iterating across a list of items.
//
// This interface is modeled on the bufio.Scanner behavior.
type iter[T any] interface {
// Item returns the current item in the iterator
Item() T

// Next advances the iterator to the next item and returns true if there is
// an item to consume, and false if the end of the input has been reached,
// or there has been an error.
//
// Next must be called before each call to Item.
Next() bool

// Err returns any error produced while iterating.
Err() error
}

// iterCloser is an iter, but also embeds the io.Closer interface, so it can be
// used to wrap a file for iterating through.
type iterCloser[T any] interface {
iter[T]
io.Closer
}

// scannerIter implements iter using a bufio.Scanner to iterate through lines in
// a file.
type scannerIter struct {
r io.ReadCloser
scanner *bufio.Scanner
}

func (i *scannerIter) Item() string {
return i.scanner.Text()
}

func (i *scannerIter) Next() bool {
return i.scanner.Scan()
}

func (i *scannerIter) Err() error {
return i.scanner.Err()
}

func (i *scannerIter) Close() error {
return i.r.Close()
}

// sliceIter implements iter using a slice for iterating.
type sliceIter[T any] struct {
values []T
next int
size int
}

func (i *sliceIter[T]) Item() T {
return i.values[i.next-1]
}

func (i *sliceIter[T]) Next() bool {
if i.next <= i.size {
i.next++
}
return i.next <= i.size
}

func (i *sliceIter[T]) Err() error {
return nil
}

func (i *sliceIter[T]) Close() error {
return nil
}

// initInput returns an iterCloser for iterating across repositories for
// collecting signals.
//
// If only one arg is specified, the code will treat it as a file and attempt to
// open it. If the file doesn't exist, and is parseable as a URL the arg will be
// treated as a repo.
//
// If more than one arg is specified they are all considered to be repos.
//
// TODO: support the ability to force args to be interpreted as either a file,
// or a list of repos.
func initInput(args []string) (iterCloser[string], error) {
if len(args) == 1 {
// If there is 1 arg, attempt to open it as a file.
fileOrRepo := args[0]
_, err := url.Parse(fileOrRepo)
urlParseFailed := err != nil

// Open the in-file for reading
r, err := infile.Open(context.Background(), fileOrRepo)
if err == nil {
return &scannerIter{
r: r,
scanner: bufio.NewScanner(r),
}, nil
} else if urlParseFailed || !errors.Is(err, os.ErrNotExist) {
// Report the error if the arg cannot be parsed as a URL, or if
// opening failed for any reason other than the file not existing.
return nil, err
}
}
// If file loading failed, or there are 2 or more args, treat args as a list
// of repos.
return &sliceIter[string]{
size: len(args),
values: args,
}, nil
}
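
The Next/Item/Err protocol used by the iterators in this file can be exercised with a small standalone program. The sketch below re-declares a trimmed-down slice iterator so that it compiles on its own; `repoIter` and `drain` are hypothetical names introduced for illustration and do not appear in the PR.

```go
package main

import "fmt"

// repoIter is a stand-in for the iter[T] interface in the diff.
type repoIter[T any] interface {
	Next() bool
	Item() T
	Err() error
}

// sliceIter walks a slice. next is the 1-based position of the current
// item, so the zero value starts before the first element.
type sliceIter[T any] struct {
	values []T
	next   int
}

func (i *sliceIter[T]) Next() bool {
	if i.next <= len(i.values) {
		i.next++
	}
	return i.next <= len(i.values)
}

func (i *sliceIter[T]) Item() T { return i.values[i.next-1] }

func (i *sliceIter[T]) Err() error { return nil }

// drain consumes the iterator the same way a bufio.Scanner is consumed:
// loop on Next, read Item, then check Err once the loop ends.
func drain[T any](it repoIter[T]) ([]T, error) {
	var out []T
	for it.Next() {
		out = append(out, it.Item())
	}
	return out, it.Err()
}

func main() {
	it := &sliceIter[string]{values: []string{
		"https://github.com/django/django",
		"https://github.com/golang/go",
	}}
	items, err := drain[string](it)
	fmt.Println(len(items), err)
}
```

Calling `Next` again after it has returned false keeps returning false, because the position stops advancing once it has moved past the end of the slice.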
46 changes: 23 additions & 23 deletions cmd/criticality_score/main.go
Expand Up @@ -15,7 +15,6 @@
package main

import (
"bufio"
"context"
"errors"
"flag"
Expand All @@ -30,7 +29,6 @@ import (
"go.uber.org/zap/zapcore"

"github.com/ossf/criticality_score/internal/collector"
"github.com/ossf/criticality_score/internal/infile"
log "github.com/ossf/criticality_score/internal/log"
"github.com/ossf/criticality_score/internal/outfile"
"github.com/ossf/criticality_score/internal/scorer"
Expand All @@ -52,22 +50,28 @@ var (
logEnv log.Env
)

func init() {
// initFlags prepares any runtime flags, usage information and parses the flags.
func initFlags() {
flag.Var(&logLevel, "log", "set the `level` of logging.")
flag.TextVar(&logEnv, "log-env", log.DefaultEnv, "set logging `env`.")
outfile.DefineFlags(flag.CommandLine, "out", "force", "append", "FILE")
outfile.DefineFlags(flag.CommandLine, "out", "force", "append", "OUTFILE")
flag.Usage = func() {
cmdName := path.Base(os.Args[0])
w := flag.CommandLine.Output()
fmt.Fprintf(w, "Usage:\n %s [FLAGS]... IN_FILE OUT_FILE\n\n", cmdName)
fmt.Fprintf(w, "Collects signals for each project repository listed.\n")
fmt.Fprintf(w, "IN_FILE must be either a file or - to read from stdin.\n")
fmt.Fprintf(w, "OUT_FILE must be either be a file or - to write to stdout.\n")
fmt.Fprintf(w, "Usage:\n %s [FLAGS]... {FILE|REPO...}\n\n", cmdName)
fmt.Fprintf(w, "Collects signals for a list of project repository urls.\n\n")
fmt.Fprintf(w, "FILE must be either a file or - to read from stdin. If FILE does not\n")
fmt.Fprintf(w, "exist it will be treated as a REPO.\n")
fmt.Fprintf(w, "Each REPO must be a project repository url.\n")
fmt.Fprintf(w, "\nFlags:\n")
flag.PrintDefaults()
}
flag.Parse()
}

// getScorer prepares a Scorer based on the flags passed to the command.
//
// nil will be returned if scoring is disabled.
func getScorer(logger *zap.Logger) *scorer.Scorer {
if *scoringDisableFlag {
logger.Info("Scoring disabled")
Expand Down Expand Up @@ -112,7 +116,7 @@ func generateScoreColumnName(s *scorer.Scorer) string {
}

func main() {
flag.Parse()
initFlags()

logger, err := log.NewLogger(logEnv, logLevel)
if err != nil {
Expand All @@ -125,8 +129,8 @@ func main() {
scoreColumnName := generateScoreColumnName(s)

// Complete the validation of args
if flag.NArg() != 1 {
logger.Error("Must have an input file specified.")
if flag.NArg() == 0 {
logger.Error("An input file or at least one repo must be specified.")
os.Exit(2)
}

Expand All @@ -152,18 +156,15 @@ func main() {
os.Exit(2)
}

inFilename := flag.Args()[0]

// Open the in-file for reading
r, err := infile.Open(context.Background(), inFilename)
// Prepare the input for reading
inputIter, err := initInput(flag.Args())
if err != nil {
logger.With(
zap.String("filename", inFilename),
zap.Error(err),
).Error("Failed to open an input file")
).Error("Failed to prepare input")
os.Exit(2)
}
defer r.Close()
defer inputIter.Close()

// Open the out-file for writing
w, err := outfile.Open(context.Background())
Expand Down Expand Up @@ -222,10 +223,9 @@ func main() {
}
})

// Read in each line from the input files
scanner := bufio.NewScanner(r)
for scanner.Scan() {
line := scanner.Text()
// Read in each repo from the input
for inputIter.Next() {
line := inputIter.Item()

u, err := url.Parse(strings.TrimSpace(line))
if err != nil {
Expand All @@ -242,7 +242,7 @@ func main() {
// Send the url to the workers
repos <- u
}
if err := scanner.Err(); err != nil {
if err := inputIter.Err(); err != nil {
logger.With(
zap.Error(err),
).Error("Failed while reading input")
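
Each input item in the loop above is trimmed and parsed with `url.Parse` before being sent to the workers. A minimal standalone sketch of that step follows; `parseRepoLine` is a hypothetical helper, and any further validation the collapsed part of the diff may perform is not reproduced here.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseRepoLine trims surrounding whitespace from an input item and parses
// it as a URL, returning an error for values that cannot be parsed.
func parseRepoLine(line string) (*url.URL, error) {
	u, err := url.Parse(strings.TrimSpace(line))
	if err != nil {
		return nil, err
	}
	return u, nil
}

func main() {
	u, err := parseRepoLine("  https://github.com/django/django \n")
	if err != nil {
		panic(err)
	}
	fmt.Println(u.Host, u.Path) // prints "github.com /django/django"

	// A value with no scheme before "://" is rejected by url.Parse.
	_, err = parseRepoLine("://missing-scheme")
	fmt.Println(err != nil) // prints "true"
}
```

Trimming first matters for file and STDIN input, where an item may carry stray leading or trailing whitespace.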