file_storage.md 8.33 KB
Newer Older
1 2 3 4 5 6
# File Storage in GitLab

We use the [CarrierWave] gem to handle file upload, store and retrieval.

There are many places where file uploading is used, according to contexts:

7
- System
8 9
  - Instance Logo (logo visible in sign in/sign up pages)
  - Header Logo (one displayed in the navigation bar)
10
- Group
11
  - Group avatars
12
- User
13 14
  - User avatars
  - User snippet attachments
15
- Project
16
  - Project avatars
17 18
  - Issues/MR/Notes Markdown attachments
  - Issues/MR/Notes Legacy Markdown attachments
Shinya Maeda's avatar
Shinya Maeda committed
19
  - CI Artifacts (archive, metadata, trace)
20
  - LFS Objects
21
  - Merge request diffs
22 23 24 25 26 27

## Disk storage

GitLab started saving everything on local disk. While directory location changed from previous versions,
they are still not 100% standardized. You can see them below:

28
| Description                           | In DB? | Relative path (from CarrierWave.root)                       | Uploader class         | model_type |
29 30 31 32 33 34 35
| ------------------------------------- | ------ | ----------------------------------------------------------- | ---------------------- | ---------- |
| Instance logo                         | yes    | uploads/-/system/appearance/logo/:id/:filename              | `AttachmentUploader`   | Appearance |
| Header logo                           | yes    | uploads/-/system/appearance/header_logo/:id/:filename       | `AttachmentUploader`   | Appearance |
| Group avatars                         | yes    | uploads/-/system/group/avatar/:id/:filename                 | `AvatarUploader`       | Group      |
| User avatars                          | yes    | uploads/-/system/user/avatar/:id/:filename                  | `AvatarUploader`       | User       |
| User snippet attachments              | yes    | uploads/-/system/personal_snippet/:id/:random_hex/:filename | `PersonalFileUploader` | Snippet    |
| Project avatars                       | yes    | uploads/-/system/project/avatar/:id/:filename               | `AvatarUploader`       | Project    |
36 37
| Issues/MR/Notes Markdown attachments        | yes    | uploads/:project_path_with_namespace/:random_hex/:filename  | `FileUploader`         | Project    |
| Issues/MR/Notes Legacy Markdown attachments | no     | uploads/-/system/note/attachment/:id/:filename              | `AttachmentUploader`   | Note       |
Shinya Maeda's avatar
Shinya Maeda committed
38
| CI Artifacts (CE)                     | yes    | shared/artifacts/:disk_hash[0..1]/:disk_hash[2..3]/:disk_hash/:year_:month_:date/:job_id/:job_artifact_id (:disk_hash is SHA256 digest of project_id) | `JobArtifactUploader`  | Ci::JobArtifact  |
39
| LFS Objects  (CE)                     | yes    | shared/lfs-objects/:hex/:hex/:object_hash                   | `LfsObjectUploader`    | LfsObject  |
40
| External merge request diffs          | yes    | shared/external-diffs/merge_request_diffs/mr-:parent_id/diff-:id | `ExternalDiffUploader` | MergeRequestDiff |
41 42

CI Artifacts and LFS Objects behave differently in CE and EE. In CE they inherit the `GitlabUploader`
43
while in EE they inherit the `ObjectStorage` and store files in and S3 API compatible object store.
44

45
In the case of Issues/MR/Notes Markdown attachments, there is a different approach using the [Hashed Storage] layout,
46 47 48
instead of basing the path into a mutable variable `:project_path_with_namespace`, it's possible to use the
hash of the project ID instead, if project migrates to the new approach (introduced in 10.2).

49 50 51 52 53
> Note: We provide an [all-in-one rake task] to migrate all uploads to object
> storage in one go. If a new Uploader class or model type is introduced, make
> sure you add a rake task invocation corresponding to it to the [category
> list].

54 55
### Path segments

56
Files are stored at multiple locations and use different path schemes.
57 58 59 60 61 62 63 64
All the `GitlabUploader` derived classes should comply with this path segment schema:

```
|   GitlabUploader
| ----------------------- + ------------------------- + --------------------------------- + -------------------------------- |
| `<gitlab_root>/public/` | `uploads/-/system/`       | `user/avatar/:id/`                | `:filename`                      |
| ----------------------- + ------------------------- + --------------------------------- + -------------------------------- |
| `CarrierWave.root`      | `GitlabUploader.base_dir` | `GitlabUploader#dynamic_segment`  | `CarrierWave::Uploader#filename` |
65
|                         | `CarrierWave::Uploader#store_dir`                             |                                  |
66 67 68 69 70 71 72

|   FileUploader
| ----------------------- + ------------------------- + --------------------------------- + -------------------------------- |
| `<gitlab_root>/shared/` | `artifacts/`              | `:year_:month/:id`                | `:filename`                      |
| `<gitlab_root>/shared/` | `snippets/`               | `:secret/`                        | `:filename`                      |
| ----------------------- + ------------------------- + --------------------------------- + -------------------------------- |
| `CarrierWave.root`      | `GitlabUploader.base_dir` | `GitlabUploader#dynamic_segment`  | `CarrierWave::Uploader#filename` |
73
|                         | `CarrierWave::Uploader#store_dir`                             |                                  |
74 75 76 77 78 79 80
|                         |                           | `FileUploader#upload_path                                            |

|   ObjectStore::Concern (store = remote)
| ----------------------- + ------------------------- + ----------------------------------- + -------------------------------- |
| `<bucket_name>`         | <ignored>                 | `user/avatar/:id/`                  | `:filename`                      |
| ----------------------- + ------------------------- + ----------------------------------- + -------------------------------- |
| `#fog_dir`              | `GitlabUploader.base_dir` | `GitlabUploader#dynamic_segment`    | `CarrierWave::Uploader#filename` |
81
|                         |                           | `ObjectStorage::Concern#store_dir`  |                                  |
82 83 84 85 86 87 88 89 90 91 92
|                         |                           | `ObjectStorage::Concern#upload_path                                    |
```

The `RecordsUploads::Concern` concern will create an `Upload` entry for every file stored by a `GitlabUploader` persisting the dynamic parts of the path using
`GitlabUploader#dynamic_path`. You may then use the `Upload#build_uploader` method to manipulate the file.

## Object Storage

By including the `ObjectStorage::Concern` in the `GitlabUploader` derived class, you may enable the object storage for this uploader. To enable the object storage
in your uploader, you need to either 1) include `RecordsUpload::Concern` and prepend `ObjectStorage::Extension::RecordsUploads` or 2) mount the uploader and create a new field named `<mount>_store`.

Pascal Borreli's avatar
Pascal Borreli committed
93
The `CarrierWave::Uploader#store_dir` is overridden to
94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143

 - `GitlabUploader.base_dir` + `GitlabUploader.dynamic_segment` when the store is LOCAL
 - `GitlabUploader.dynamic_segment` when the store is REMOTE (the bucket name is used to namespace)

### Using `ObjectStorage::Extension::RecordsUploads`

> Note: this concern will automatically include `RecordsUploads::Concern` if not already included.

The `ObjectStorage::Concern` uploader will search for the matching `Upload` to select the correct object store. The `Upload` is mapped using `#store_dirs + identifier` for each store (LOCAL/REMOTE).

```ruby
class SongUploader < GitlabUploader
  include RecordsUploads::Concern
  include ObjectStorage::Concern
  prepend ObjectStorage::Extension::RecordsUploads

  ...
end

class Thing < ActiveRecord::Base
  mount :theme, SongUploader # we have a great theme song!

  ...
end
```

### Using a mounted uploader

The `ObjectStorage::Concern` will query the `model.<mount>_store` attribute to select the correct object store.
This column must be present in the model schema.

```ruby
class SongUploader < GitlabUploader
  include ObjectStorage::Concern

  ...
end

class Thing < ActiveRecord::Base
  attr_reader :theme_store # this is an ActiveRecord attribute
  mount :theme, SongUploader # we have a great theme song!

  def theme_store
    super || ObjectStorage::Store::LOCAL
  end

  ...
end
```

144 145
[CarrierWave]: https://github.com/carrierwaveuploader/carrierwave
[Hashed Storage]: ../administration/repository_storage_types.md
146
[all-in-one rake task]: ../administration/raketasks/uploads/migrate.md
147
[category list]: https://gitlab.com/gitlab-org/gitlab-ce/blob/master/lib/tasks/gitlab/uploads/migrate.rake