The picture has an advanced scenario where different _API_ instances are wired to different _Scheduler_ instances. Component names prefixed with * don't exist but are mentioned as possible implementations of existing interfaces.
@@ -16,9 +16,9 @@ The central idea about the design is that an _API_ defines the desired storing s
16
16
17
17
When a new or updated _CidConfig_ is pushed to an _API_, the _API_ delegates this work to the _Scheduler_. The _Scheduler_ executes whatever work is necessary to comply with the new or updated Cid configuration.
18
18
19
-
From the _Scheduler_ point of view, this work is considered a _Job_ created by _API_. The job refers to doing the necessary work to enforce the new _CidConfig_. The _API_ can watch for this _Job_ state changes to see if the task of pushing a new _CidConfig_ is queued, in progress, executing, executed successfully, or failed. The _Scheduler_ also provides a human-friendly log stream of work being done for a _Cid_.
19
+
From the _Scheduler_'s point of view, this work is a _Job_ created by the _API_. The _Job_ refers to doing the work necessary to enforce the new _CidConfig_. The _API_ can watch this _Job_'s state changes to see whether the task of pushing a new _CidConfig_ is queued, executing, finished successfully, failed, or canceled. The _Scheduler_ also provides a human-friendly log stream of the work being done for a _Cid_.
20
20
21
-
Apart from executing _API_ triggered work, like pushing a new _CidConfig_, the _Scheduler_ also does some background jobs related to deals-renewal for Cids, which have this feature enabled in their _CidConfig_. Similarly, it has background jobs for repair actions.
21
+
The _Scheduler_ also executes proactive actions for previously pushed _CidConfigs_ that have the _renew_ or _repair_ feature enabled. Finally, the _Scheduler_ is designed to resume any kind of interrupted job execution.
22
22
23
23
## Components
24
24
The following sections give a more detailed description of each component and interface in the diagram.
@@ -28,229 +28,59 @@ This component is responsible for creating _API_ instances. When a new _API_ ins
28
28
29
29
The mapping between _auth-tokens_ and _API_ is controlled by an _Auth_ component. Further features such as token invalidation, finer-grained access control per action, or multiple auth token support will live in this module.
30
30
31
-
Every _API_ instance needs a dedicated Filecoin address that will be used to pay for actions done on the network. _Manager_ delegates wallet related activities to _WalletManager_, such as: creating new addresses for new _API_ instances, sending funds to those addresses, getting the balance.
31
+
Since an _API_ might store data in the Filecoin network, it is assigned a newly created Filecoin address, which is controlled by the underlying Filecoin client used in the _ColdStorage_. _Manager_ creates and assigns this new wallet account automatically, using the _WalletManager_ subcomponent.
32
+
33
+
_Manager_ can be configured to auto-fund newly created wallet addresses, so that a newly created _API_ has funds to execute actions in the Filecoin network. This feature is optional. If enabled, a _masterAddress_ and _initialFunds_ are configured, indicating the Filecoin client wallet address from which funds are sent and the amount of the transfer.
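The auto-funding option described above can be sketched as follows. This is only an illustration: the `FundingConfig` and `Transfer` types and the `PlanFunding` helper are hypothetical names, not Powergate's API. Only the _masterAddress_ and _initialFunds_ concepts come from this document.

```go
package main

import (
	"errors"
	"fmt"
)

// FundingConfig is a hypothetical shape for the Manager auto-funding option.
type FundingConfig struct {
	Enabled       bool
	MasterAddress string // address funds are sent from (masterAddress)
	InitialFunds  int64  // amount sent to each new address (initialFunds)
}

// Transfer is a hypothetical description of a funding transaction.
type Transfer struct {
	From   string
	To     string
	Amount int64
}

// PlanFunding returns the transfer for a newly created address,
// or nil when auto-funding is disabled.
func PlanFunding(cfg FundingConfig, newAddr string) (*Transfer, error) {
	if !cfg.Enabled {
		return nil, nil
	}
	if cfg.MasterAddress == "" || cfg.InitialFunds <= 0 {
		return nil, errors.New("auto-funding needs a master address and positive initial funds")
	}
	return &Transfer{From: cfg.MasterAddress, To: newAddr, Amount: cfg.InitialFunds}, nil
}

func main() {
	cfg := FundingConfig{Enabled: true, MasterAddress: "master-addr", InitialFunds: 1000}
	t, err := PlanFunding(cfg, "new-api-addr")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("send %d from %s to %s\n", t.Amount, t.From, t.To)
}
```

Returning a nil transfer when the feature is disabled keeps the caller's flow the same whether or not auto-funding is configured.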
34
+
32
35
33
36
### API
34
37
_API_ is a concrete instance of FFS to be used by a client.
35
38
It owns the following information:
36
-
- A Filecoin address.
39
+
- At least one Filecoin address. Later, the client can opt to create more addresses and indicate which one to use when performing an action.
37
40
- _CidConfigs_ describing the desired state for Cids to be stored in Hot and Cold storage.
38
41
- A default _CidConfig_ to be used unless an explicit _CidConfig_ is given.
39
42
40
-
It has APIs to create/update _CidConfigs_, get its address information such as balance, watch for _Job_ state changes or human-friendly Log outputs about work done by the _Scheduler_. Refer to the _CidConfig_ section to understand more about this important structure.
43
+
The instance provides APIs to:
44
+
- Get and Set the default _CidConfig_ used to store new data.
45
+
- Get summary information about all the _Cids_ stored in this instance.
46
+
- Manage Filecoin wallet addresses under its control.
47
+
- Send FIL transactions from owned Filecoin wallet addresses.
48
+
- Create, replace, and remove _CidConfigs_, which indicate which Cids to store in the instance.
49
+
- Provide detailed information about a particular stored Cid.
50
+
- Get information about the status of executing _Jobs_ corresponding to the FFS instance.
51
+
- Stream human-friendly logs about events happening for a _Cid_: storage, renewals, repairs, and any other action being done for it.
41
52
42
53
### Scheduler
43
54
44
-
The main goal of this component is to do whatever is possible to reach a desired storing state for a _Cid_.
45
-
46
-
Its interface for _API_ is defined by the interface:
47
-
```go
48
-
// Scheduler enforces a CidConfig orchestrating Hot and Cold storages.
49
-
type Scheduler interface {
50
-
// PushConfig push a new or modified configuration for a Cid. It returns
51
-
// the JobID which tracks the current state of execution of that task.
In a nutshell, the _Scheduler_ is the component responsible for orchestrating the Hot and Cold storage to enforce indicated _CidConfigs_ by connected _API_.
69
56
70
-
// WatchJobs is a blocking method that sends to a channel state updates
71
-
// for all Jobs created by an Instance. The ctx should be canceled when
// WatchLogs writes new log entries from Cid related executions.
76
-
// This is a blocking operation that should be canceled by canceling the
77
-
// provided context.
78
-
WatchLogs(context.Context, chan<- LogEntry) error
79
-
80
-
// Untrack marks a Cid to be untracked for any background processes such as
81
-
// deal renewal, or repairing.
82
-
Untrack(cid.Cid) error
83
-
}
84
-
```
57
+
Refer to the [Go docs](https://pkg.go.dev/github.com/textileio/powergate/ffs/scheduler?tab=doc) to see its exported API.
85
58
86
59
### Responsibilities
87
-
When a new/updated _CidConfig_ is pushed by an _API_, the _Scheduler_ bounds the work of enforcing that state in a _Job_.
88
-
This _Job_ has a lifecycle: Queued, Executing, Success, Canceled, or Failed.
60
+
When a new _CidConfig_ is pushed by an _API_, the _Scheduler_ is responsible for orchestrating whatever actions are necessary to enforce it with the Hot and Cold storage.
61
+
62
+
Every new _CidConfig_, whether it is the first or a newer version for a Cid, is encapsulated in a _Job_. A _Job_ is the unit of work that the _Scheduler_ executes. A _Job_ has one of the following statuses: _Queued_, _Executing_, _Done_, _Failed_, or _Canceled_.
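The _Job_ statuses can be modeled as a small enumeration. The sketch below is illustrative only: the `JobStatus` type and the `IsFinal` helper are hypothetical names, and treating _Done_, _Failed_, and _Canceled_ as terminal is an assumption, not something this document states.

```go
package main

import "fmt"

// JobStatus is a hypothetical enumeration of the Job statuses listed above.
type JobStatus int

const (
	Queued JobStatus = iota
	Executing
	Done
	Failed
	Canceled
)

// statusNames maps each status to its human-readable name.
var statusNames = map[JobStatus]string{
	Queued:    "Queued",
	Executing: "Executing",
	Done:      "Done",
	Failed:    "Failed",
	Canceled:  "Canceled",
}

// String implements fmt.Stringer.
func (s JobStatus) String() string {
	if name, ok := statusNames[s]; ok {
		return name
	}
	return "Unknown"
}

// IsFinal reports whether the status is terminal.
// Assumption: Done, Failed, and Canceled allow no further transitions.
func (s JobStatus) IsFinal() bool {
	return s == Done || s == Failed || s == Canceled
}

func main() {
	for _, s := range []JobStatus{Queued, Executing, Done, Failed, Canceled} {
		fmt.Printf("%s final=%v\n", s, s.IsFinal())
	}
}
```

A watcher of _Job_ state changes could use a check like `IsFinal` to decide when to stop listening for updates.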
89
63
64
+
Apart from executing _Jobs_, the _Scheduler_ has background processes to keep enforcing configuration features that require tracking. For example, if a _CidConfig_ has renewal or repair enabled, the _Scheduler_ is responsible for doing the necessary work.
90
65
Apart from _Jobs_, the _Scheduler_ has background tasks that monitor deal renewals or repair operations.
91
66
92
-
In summary, the _Scheduler_ is concerned with enforcing a _CidConfig_ for a Cid. It does this by inspecting the current state of the Cid in both storages, deciding which actions are necessary in both layers, and using the Hot and Cold storage APIs to execute that work.
67
+
In summary, _APIs_ delegate *the desired state for a Cid* and the _Scheduler_ is responsible for *ensuring that state is true* by orchestrating the Hot and Cold storage.
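The idea of comparing a desired state against the current state can be sketched as a small planning function. Everything here is hypothetical and greatly simplified: the `State` type, the `PlanActions` helper, and the action strings are illustrative names, not the _Scheduler_'s actual logic.

```go
package main

import "fmt"

// State is a hypothetical, simplified view of where a Cid is stored.
type State struct {
	InHot  bool
	InCold bool
}

// PlanActions compares the desired state with the actual state and
// returns the actions needed to converge.
// Assumption: this sketch never plans removals from cold storage.
func PlanActions(desired, actual State) []string {
	var actions []string
	if desired.InHot && !actual.InHot {
		actions = append(actions, "store in hot storage")
	}
	if !desired.InHot && actual.InHot {
		actions = append(actions, "remove from hot storage")
	}
	if desired.InCold && !actual.InCold {
		actions = append(actions, "store in cold storage")
	}
	return actions
}

func main() {
	desired := State{InHot: true, InCold: true}
	actual := State{InHot: true}
	for _, a := range PlanActions(desired, actual) {
		fmt.Println(a)
	}
}
```

When the desired and actual states already match, the plan is empty and no work needs to be scheduled.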
93
68
94
69
#### Hot and Cold storage abstraction
95
-
The _Scheduler_ is abstracted from particular implementations of the _Hot Storage_ and _Cold Storage_.
96
-
It relies on the following interfaces:
97
-
98
-
```go
99
-
// HotStorage is a fast storage layer for Cid data.
100
-
type HotStorage interface {
101
-
// Add adds io.Reader data ephemerally (not pinned).
102
-
Add(context.Context, io.Reader) (cid.Cid, error)
103
-
104
-
// Remove removes a stored Cid.
105
-
Remove(context.Context, cid.Cid) error
106
-
107
-
// Get retrieves a stored Cid data.
108
-
Get(context.Context, cid.Cid) (io.Reader, error)
70
+
The _Scheduler_ interacts with abstractions for the Hot and Cold storage.
71
+
Refer to the Go docs of the [HotStorage](https://pkg.go.dev/github.com/textileio/powergate@v0.0.1-beta.6/ffs?tab=doc#HotStorage) and [ColdStorage](https://pkg.go.dev/github.com/textileio/powergate@v0.0.1-beta.6/ffs?tab=doc#ColdStorage) to understand their APIs.
109
72
110
-
// Store stores a Cid. If the data wasn't previously Added,
111
-
// depending on the implementation it may use internal mechanisms
112
-
// for pulling the data, e.g: IPFS network
113
-
Store(context.Context, cid.Cid) (int, error)
73
+
The _ColdStorage_ interface is noticeably biased towards a _Filecoin client_ implementation, but it still allows other tiered cold storages to be included where deal creation or retrieval is wanted. Refer to the diagram at the top of this document to understand possible configurations.
114
74
115
-
// Replace replaces a stored Cid with a new one. It's mostly
116
-
// thought for mutating data doing this efficiently.
It also relies on a _MinerSelector_ interface, which implements a particular strategy to fetch the most desirable N miners needed for making deals in the _Cold Storage_:
150
-
```go
151
-
// MinerSelector returns miner addresses and ask storage information using a
152
-
// desired strategy.
153
-
type MinerSelector interface {
154
-
// GetMiners returns a specified amount of miners that satisfy
Particular implementations of _MinerSelector_ include:
160
-
- _FixedMiners_: which always returns a particular fixed list of miner addresses.
161
-
- _ReputationSorted_: which returns the miner addresses using a reputation system built on top of miner information.
162
-
163
-
164
-
#### Configuration scenarios
165
-
Looking at the diagram in the _Overview_ section, we can see different Hot and Cold storage configurations:
166
-
167
-
In the first dotted box, a _Scheduler_ uses an _IPFS Node_ as the _HotStorage_ using the _CoreIPFS_ adapter as the interface implementation, which uses the _http api_ client to talk with the _IPFS node_. It also uses the _ColdFil_ adapter as the _ColdStorage_ implementation, which uses the _DealModule_ to make deals with a _Lotus instance_. It uses a _ReputationSorted_ implementation of _MinerSelector_ to fetch the best miners from a miner's reputation system.
168
-
169
-
The second dotted box shows another possible configuration, which uses an _IPFS Cluster_ with a _HotIpfsCluster_ adapter of _HotStorage_, or even a more advanced _HotStorage_ called _HotS3IpfsCluster_, which saves a _Cid_ into _IPFS Cluster_ and some _AWS S3_ instance. The _MinerSelector_ implementation for the _ColdStorage_ is _FixedMiners_, which always returns a configured fixed list of miners to make deals with.
75
+
The _ColdStorage_ relies on a _MinerSelector_ interface to query the universe of available miners to make new deals. Refer to the [Go doc](https://pkg.go.dev/github.com/textileio/powergate/ffs@v0.0.1-beta.6?tab=doc#MinerSelector) to understand its API.
170
76
77
+
Powergate has the _Reputation Module_, which leverages indexes built from miner data to provide a universe of available miners sorted by chosen criteria. In a full run of FFS, the _ColdStorage_ is connected to a _MinerSelector_ with the _Reputation Module_ implementation. However, for integration tests, a _FixedMiners_ miner selector is used to bound the universe of available miners for deals to desired values.
171
78
79
+
The _MinerSelector_ API already provides enough filtering configuration to force the use or exclusion of particular miners. In general, implementations other than the default one should be used when the universe of available miners needs to be fully controlled by design, rather than determined by the miners available on the connected Filecoin network.
172
80
173
81
### Cid Configuration
174
-
Cid configurations are a central part of FFS mechanics. An _API_ defines the desired state of the Cid in the Hot and Cold storage. Currently, it has the following structure:
175
-
```go
176
-
// CidConfig has a Cid desired storing configuration for a Cid in
177
-
// Hot and Cold storage.
178
-
type CidConfig struct {
179
-
// Cid is the Cid of the stored data.
180
-
Cid cid.Cid
181
-
// Hot has desired storing configuration in Hot Storage.
182
-
Hot HotConfig
183
-
// Cold has desired storing configuration in the Cold Storage.
184
-
Cold ColdConfig
185
-
}
186
-
187
-
// HotConfig is the desired storage of a Cid in a Hot Storage.
188
-
type HotConfig struct {
189
-
// Enabled indicates if Cid data is stored. If true, it will consider
190
-
// further configurations to execute actions.
191
-
Enabled bool
192
-
// AllowUnfreeze indicates that if data isn't available in the Hot Storage,
193
-
// it's allowed to be fed by Cold Storage if available.
194
-
AllowUnfreeze bool
195
-
// Ipfs contains configuration related to storing Cid data in an IPFS node.
196
-
Ipfs IpfsConfig
197
-
}
198
-
199
-
// IpfsConfig is the desired storage of a Cid in IPFS.
200
-
type IpfsConfig struct {
201
-
// AddTimeout is an upper bound on adding data to IPFS node from
202
-
// the network before failing.
203
-
AddTimeout int
204
-
}
205
-
206
-
// ColdConfig is the desired state of a Cid in a cold layer.
207
-
type ColdConfig struct {
208
-
// Enabled indicates that data will be saved in Cold storage.
209
-
// If it is switched from false->true, it will consider the other attributes
210
-
// as the desired state of the data in this Storage.
211
-
Enabled bool
212
-
// Filecoin describes the desired Filecoin configuration for a Cid in the
213
-
// Filecoin network.
214
-
Filecoin FilConfig
215
-
}
216
-
217
-
// FilConfig is the desired state of a Cid in the Filecoin network.
218
-
type FilConfig struct {
219
-
// RepFactor indicates the desired amount of active deals
220
-
// with different miners to store the data. While making deals
221
-
// the other attributes of FilConfig are considered for miner selection.
222
-
RepFactor int
223
-
// DealDuration indicates the duration to be used when making new deals.
224
-
DealDuration int64
225
-
// ExcludedMiners is a set of miner addresses that will never be selected
226
-
// when making new deals, even if they comply to other filters.
227
-
ExcludedMiners []string
228
-
// TrustedMiners is a set of miner addresses which will be forcibly used
229
-
// when making new deals. An empty/nil list disables this feature.
230
-
TrustedMiners []string
231
-
// CountryCodes indicates that new deals should select miners on specific
232
-
// countries.
233
-
CountryCodes []string
234
-
// Renew indicates deal-renewal configuration.
235
-
Renew FilRenew
236
-
// Addr is the wallet address used to store the data in filecoin
237
-
Addr string
238
-
}
239
-
240
-
// FilRenew contains renew configuration for a Cid Cold Storage deals.
241
-
type FilRenew struct {
242
-
// Enabled indicates that deal-renewal is enabled for this Cid.
243
-
Enabled bool
244
-
// Threshold indicates how many epochs before expiring should trigger
245
-
// deal renewal. e.g: 100 epoch before expiring.
246
-
Threshold int
247
-
}
248
-
```
249
-
250
-
Each attribute has a description of its goal.
251
-
252
-
Both the Hot and Cold configurations have an `Enabled` flag to enable/disable the Cid data storage in each of them.
253
-
If a client only wants to save data in the Cold storage, it can set `HotConfig.Enabled: false` and `ColdConfig.Enabled: true`. The same applies inversely.
82
+
Throughout this document we've referred to _CidConfigs_ as a central concept in the FFS module. A _CidConfig_ indicates the desired storing state of a _Cid_ scoped to an _API_. Refer to the [Go docs](https://pkg.go.dev/github.com/textileio/powergate/ffs@v0.0.1-beta.6?tab=doc#CidConfig) to understand its rich configuration.
83
+
254
84
255
85
#### _API_ _Get(...)_ operation
256
86
One important point is that `Get` operations in _API_ can only retrieve data from the Hot Storage (via `GetCidFromHot` in the _Scheduler_).
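The constraint that `Get` only reads from the Hot Storage can be illustrated with a toy in-memory store. This is a sketch under stated assumptions: the `HotStore` type, the `ErrNotInHot` error, and the use of plain strings as Cids are hypothetical and do not reflect Powergate's implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotInHot is a hypothetical error for data missing from hot storage.
var ErrNotInHot = errors.New("cid not available in hot storage")

// HotStore is a toy in-memory stand-in for a hot storage layer.
type HotStore map[string][]byte

// Get returns the data for a Cid only if it is present in hot storage.
// It never falls back to cold storage.
func (h HotStore) Get(cid string) ([]byte, error) {
	data, ok := h[cid]
	if !ok {
		return nil, ErrNotInHot
	}
	return data, nil
}

func main() {
	hot := HotStore{"cid-a": []byte("hello")}

	if data, err := hot.Get("cid-a"); err == nil {
		fmt.Printf("cid-a: %s\n", data)
	}
	if _, err := hot.Get("cid-b"); err != nil {
		fmt.Println("cid-b:", err)
	}
}
```

Under this model, data that lives only in Cold Storage has to be brought into the Hot Storage before a `Get` can succeed.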