
Commit 2cb9fcd

ffs: update design doc (textileio#436)
* update doc

Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>

* misspell

Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
1 parent acdd24c commit 2cb9fcd

1 file changed

Lines changed: 33 additions & 203 deletions

ffs/Design.md

@@ -4,10 +4,10 @@
44

55
This document presents the general design of the `ffs` package of `powergate`.
66

7-
**Disclaimer**: This's ongoing work, so some design, interface definition, etc. might change soon as implementation continues.
7+
**Disclaimer**: This is ongoing work, so the design will continue to change.
88

99
The following picture presents principal packages and interfaces that are part of the design:
10-
![FFS Design](https://user-images.githubusercontent.com/6136245/79258737-fbd21c80-7e61-11ea-9616-3521f543184f.png)
10+
![FFS Design](https://user-images.githubusercontent.com/6136245/83649396-847d5700-a58d-11ea-8d93-5ea20ca1bda7.png)
1111

1212

1313
The picture has an advanced scenario where different _API_ instances are wired to different _Scheduler_ instances. Component names prefixed with * don't exist but are mentioned as possible implementations of existing interfaces.
@@ -16,9 +16,9 @@ The central idea about the design is that an _API_ defines the desired storing s
1616

1717
When a new or updated _CidConfig_ is pushed in an _API_, it delegates this work to the _Scheduler_. The _Scheduler_ will execute whatever work is necessary to comply with the new/updated Cid configuration.
1818

19-
From the _Scheduler_ point of view, this work is considered a _Job_ created by _API_. The job refers to doing the necessary work to enforce the new _CidConfig_. The _API_ can watch for this _Job_ state changes to see if the task of pushing a new _CidConfig_ is queued, in progress, executing, executed successfully, or failed. The _Scheduler_ also provides a human-friendly log stream of work being done for a _Cid_.
19+
From the _Scheduler_'s point of view, this work is considered a _Job_ created by the _API_. The job refers to doing the necessary work to enforce the new _CidConfig_. The _API_ can watch this _Job_'s state changes to see if the task of pushing a new _CidConfig_ is queued, executing, finished successfully, failed, or canceled. The _Scheduler_ also provides a human-friendly log stream of work being done for a _Cid_.
2020

21-
Apart from executing _API_ triggered work, like pushing a new _CidConfig_, the _Scheduler_ also does some background jobs related to deals-renewal for Cids, which have this feature enabled in their _CidConfig_. Similarly, it has background jobs for repair actions.
21+
The _Scheduler_ also executes proactive actions for previously pushed _CidConfigs_ that enabled the _renew_ or _repair_ feature. Finally, the _Scheduler_ is designed to resume any kind of interrupted job execution.
2222

2323
## Components
2424
The following sections give a more detailed description of each component and interface in the diagram.
@@ -28,229 +28,59 @@ This component is responsible for creating _API_ instances. When a new _API_ ins
2828

2929
The mapping between _auth-tokens_ and _API_ is controlled by an _Auth_ component. Further features such as token invalidation, finer-grained access control per action, or multiple auth token support will live in this module.
3030

31-
Every _API_ instance needs a dedicated Filecoin address that will be used to pay for actions done on the network. _Manager_ delegates wallet related activities to _WalletManager_, such as: creating new addresses for new _API_ instances, sending funds to those addresses, getting the balance.
31+
Since an _API_ might store data in the Filecoin network, it is assigned a newly created Filecoin address, which is controlled by the underlying Filecoin client used in the _ColdStorage_. The process of creating and assigning this new wallet account is done automatically by _Manager_, using a subcomponent called _WalletManager_.
32+
33+
_Manager_ can be configured to auto-fund newly created wallet addresses, so that a newly created _API_ has funds to execute actions in the Filecoin network. This feature is optional. If enabled, a _masterAddress_ and _initialFunds_ are configured, which indicate the Filecoin client wallet address that funds are sent from and the amount of the transfer.
34+
3235

3336
### API
3437
_API_ is a concrete instance of FFS to be used by a client.
3538
It owns the following information:
36-
- A Filecoin address.
39+
- At least one Filecoin address. The client can later create more addresses and indicate which one to use when performing an action.
3740
- _CidConfigs_ describing the desired state for Cids to be stored in Hot and Cold storage.
3841
- A default _CidConfig_ to be used unless an explicit _CidConfig_ is given.
3942

40-
It has APIs to create/update _CidConfigs_, get its address information such as balance, watch for _Job_ state changes or human-friendly Log outputs about work done by the _Scheduler_. Refer to the _CidConfig_ section to understand more about this important structure.
43+
The instance provides APIs to:
44+
- Get and Set the default _CidConfig_ used to store new data.
45+
- Get summary information about all the _Cids_ stored in this instance.
46+
- Manage Filecoin wallet addresses under its control.
47+
- Send FIL transactions from owned Filecoin wallet addresses.
48+
- Create, replace, and remove _CidConfigs_, which indicate which Cids to store in the instance.
49+
- Provide detailed information about a particular stored Cid.
50+
- Get information about the status of executing _Jobs_ corresponding to the FFS instance.
51+
- Watch human-friendly log streams about events happening for a _Cid_: storage, renewals, repairs, and any other action being done for it.
4152

4253
### Scheduler
4354

44-
The main goal of this component is to do whatever its possible to reach a desired storing state for a _Cid_.
45-
46-
Its interface for _API_ is defined by the interface:
47-
```go
48-
// Scheduler enforces a CidConfig orchestrating Hot and Cold storages.
49-
type Scheduler interface {
50-
// PushConfig push a new or modified configuration for a Cid. It returns
51-
// the JobID which tracks the current state of execution of that task.
52-
PushConfig(APIID, string, CidConfig) (JobID, error)
53-
54-
// PushReplace push a new or modified configuration for a Cid, replacing
55-
// an existing one. The replaced Cid will be unstored from the Hot Storage.
56-
// Also it will be untracked (refer to Untrack() to understand implications)
57-
PushReplace(APIID, string, CidConfig, cid.Cid) (JobID, error)
58-
59-
// GetCidInfo returns the current Cid storing state. This state may be different
60-
// from CidConfig which is the *desired* state.
61-
GetCidInfo(cid.Cid) (CidInfo, error)
62-
63-
// GetCidFromHot returns an Reader with the Cid data. If the data isn't in the Hot
64-
// Storage, it errors with ErrHotStorageDisabled.
65-
GetCidFromHot(context.Context, cid.Cid) (io.Reader, error)
66-
67-
// GetJob gets the a Job.
68-
GetJob(JobID) (Job, error)
55+
In a nutshell, the _Scheduler_ is the component responsible for orchestrating the Hot and Cold storage to enforce the _CidConfigs_ indicated by connected _API_ instances.
6956

70-
// WatchJobs is a blocking method that sends to a channel state updates
71-
// for all Jobs created by an Instance. The ctx should be canceled when
72-
// to stop receiving updates.
73-
WatchJobs(context.Context, chan<- Job, APIID) error
74-
75-
// WatchLogs writes new log entries from Cid related executions.
76-
// This is a blocking operation that should be canceled by canceling the
77-
// provided context.
78-
WatchLogs(context.Context, chan<- LogEntry) error
79-
80-
//Untrack marks a Cid to be untracked for any background processes such as
81-
// deal renewal, or repairing.
82-
Untrack(cid.Cid) error
83-
}
84-
```
57+
Refer to the [Go docs](https://pkg.go.dev/github.com/textileio/powergate/ffs/scheduler?tab=doc) to see its exported API.
8558

8659
### Responsibilities
87-
When a new/updated _CidConfig_ is pushed by an _API_, the _Scheduler_ bounds the work of enforcing that state in a _Job_.
88-
This _Job_ has a lifecycle: Queued, Executing, Success, Canceled, or Failed.
60+
When a new _CidConfig_ is pushed by an _API_, the _Scheduler_ is responsible for orchestrating whatever actions are necessary to enforce it with the Hot and Cold storage.
61+
62+
Every new _CidConfig_, whether it is the first or a newer version for a Cid, is encapsulated in a _Job_. A _Job_ is the unit of work that the _Scheduler_ executes. A _Job_ has one of the following statuses: _Queued_, _Executing_, _Done_, _Failed_, or _Canceled_.
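
The statuses listed above could be modeled roughly as follows. This is only an illustrative sketch: the `JobStatus` type, its constant names, and the `IsFinal` helper are hypothetical and are not taken from the `powergate` code base, and treating _Done_, _Failed_, and _Canceled_ as final states is an assumption.

```go
package main

import "fmt"

// JobStatus is a hypothetical enum mirroring the statuses named in this document.
type JobStatus int

const (
	Queued JobStatus = iota
	Executing
	Done
	Failed
	Canceled
)

// String returns the status name as written in this document.
func (s JobStatus) String() string {
	switch s {
	case Queued:
		return "Queued"
	case Executing:
		return "Executing"
	case Done:
		return "Done"
	case Failed:
		return "Failed"
	case Canceled:
		return "Canceled"
	default:
		return "Unknown"
	}
}

// IsFinal reports whether a Job in this status is not expected to change again.
// Assumption of this sketch: Done, Failed, and Canceled are final.
func (s JobStatus) IsFinal() bool {
	return s == Done || s == Failed || s == Canceled
}

func main() {
	// Print each status and whether a watcher could stop waiting on it.
	for _, s := range []JobStatus{Queued, Executing, Done, Failed, Canceled} {
		fmt.Printf("%s final=%v\n", s, s.IsFinal())
	}
}
```

A client watching _Job_ updates could use a helper like `IsFinal` to decide when to stop listening.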
8963

64+
Apart from executing _Jobs_, the _Scheduler_ has background processes to keep enforcing configuration features that require tracking. For example, if a _CidConfig_ has renewal or repair enabled, the _Scheduler_ is responsible for doing the necessary work.
9065
Apart from _Jobs_, the _Scheduler_ has background tasks that monitor deal renewals or repair operations.
9166

92-
In summary, the _Scheduler_ is concerned about enforcing a _CidConfig_ for a Cid. It does this by inspecting the current state of the Cid in both storages, deciding on which is the necessary actions to make in both layers, and using the Hot and Cold storage APIs to execute that necessary work.
67+
In summary, _APIs_ delegate *the desired state for a Cid* and the _Scheduler_ is responsible for *ensuring that state is true* by orchestrating the Hot and Cold storage.
9368

9469
#### Hot and Cold storage abstraction
95-
The _Scheduler_ is abstracted from particular implementations of the _Hot Storage_ and _Cold Storage_.
96-
It relies on the following interfaces:
97-
98-
```go
99-
// HotStorage is a fast storage layer for Cid data.
100-
type HotStorage interface {
101-
// Add adds io.Reader data ephemerally (not pinned).
102-
Add(context.Context, io.Reader) (cid.Cid, error)
103-
104-
// Remove removes a stored Cid.
105-
Remove(context.Context, cid.Cid) error
106-
107-
// Get retrieves a stored Cid data.
108-
Get(context.Context, cid.Cid) (io.Reader, error)
70+
The _Scheduler_ interacts with abstractions for the Hot and Cold storage.
71+
Refer to the Go docs of the [HotStorage](https://pkg.go.dev/github.com/textileio/powergate@v0.0.1-beta.6/ffs?tab=doc#HotStorage) and [ColdStorage](https://pkg.go.dev/github.com/textileio/powergate@v0.0.1-beta.6/ffs?tab=doc#ColdStorage) to understand their APIs.
10972

110-
// Store stores a Cid. If the data wasn't previously Added,
111-
// depending on the implementation it may use internal mechanisms
112-
// for pulling the data, e.g: IPFS network
113-
Store(context.Context, cid.Cid) (int, error)
73+
The _ColdStorage_ interface is noticeably biased towards using a _Filecoin client_ in its implementation, but it still allows other tiered cold storages to be included where deal creation or retrieval is wanted. Refer to the diagram at the top of this document to understand possible configurations.
11474

115-
// Replace replaces a stored Cid with a new one. It's mostly
116-
// thought for mutating data doing this efficiently.
117-
Replace(context.Context, cid.Cid, cid.Cid) (int, error)
118-
119-
// Put adds a raw block.
120-
Put(context.Context, blocks.Block) error
121-
122-
// IsStore returns true if the Cid is stored, or false
123-
// otherwise.
124-
IsStored(context.Context, cid.Cid) (bool, error)
125-
}
126-
127-
// ColdStorage is slow/cheap storage for Cid data. It has
128-
// native support for Filecoin storage.
129-
type ColdStorage interface {
130-
// Store stores a Cid using the provided configuration and
131-
// account address. It returns a slice of deal errors happened
132-
// during execution.
133-
Store(context.Context, cid.Cid, FilConfig) (FilInfo, []DealError, error)
134-
135-
// Fetch fetches the cid data in the underlying storage.
136-
Fetch(context.Context, cid.Cid, car.Store, string) error
137-
138-
// EnsureRenewals executes renewal logic for a Cid under a particular
139-
// configuration. It returns a slice of deal errors happened during execution.
140-
EnsureRenewals(context.Context, cid.Cid, FilInfo, FilConfig) (FilInfo, []DealError, error)
141-
142-
// IsFIlDealActive returns true if the proposal Cid is active on chain;
143-
// returns false otherwise.
144-
IsFilDealActive(context.Context, cid.Cid) (bool, error)
145-
}
146-
```
147-
148-
#### MinerSelector abstraction
149-
It also relies on a _MinerSelector_ interfaces which implement a particular strategy to fetch the most desirable N miners needed for making deals in the _Cold Storage_:
150-
```go
151-
// MinerSelector returns miner addresses and ask storage information using a
152-
// desired strategy.
153-
type MinerSelector interface {
154-
// GetMiners returns a specified amount of miners that satisfy
155-
// provided filters.
156-
GetMiners(int, MinerSelectorFilter) ([]MinerProposal, error)
157-
}
158-
```
159-
Particular implementations of _MinerSelector_ include:
160-
- _FixedMiners_: which always returns a particular fixed list of miner addresses.
161-
- _ReputationSorted_: which returns the miner addresses using a reputation system built on top of miner information.
162-
163-
164-
#### Configuration scenarios
165-
Looking at diagram in the _Overview_ section we can see some different Hot and Cold storages:
166-
167-
In the first dotted box, a _Scheduler_ uses an _IPFS Node_ as the _HotStorage_ using the _CoreIPFS_ adapter as the interface implementation, which uses the _http api_ client to talk with the _IPFS node_. It also uses the _ColdFil_ adapter as the _ColdStorage_ implementation, which uses the _DealModule_ to make deals with a _Lotus instance_. It uses a _ReputationSorted_ implementation of _MinerSelector_ to fetch the best miners from a miner's reputation system.
168-
169-
In the second dotted box, shows another possible configuration in which uses an _IPFS Cluster_ with a _HotIpfsCluster_ adapter of _HotStorage_; or even a more advanced _HotStorage_ called _HotS3IpfsCluster_ which saves _Cid_ into _IPFS Cluster_ and some _AWS S3_ instance. The _MinerSelector_ implementation for the _ColdStorage_ is _FixedMiners_ which always returns a configured fixed list of miners to make deals with.
75+
The _ColdStorage_ relies on a _MinerSelector_ interface to query the universe of available miners to make new deals. Refer to the [Go doc](https://pkg.go.dev/github.com/textileio/powergate/ffs@v0.0.1-beta.6?tab=doc#MinerSelector) to understand its API.
17076

77+
Powergate has the _Reputation Module_, which leverages indexes built from miner data to provide a universe of available miners sorted by a chosen criterion. In a full run of FFS, the _ColdStorage_ is connected to a _MinerSelector_ with the _Reputation Module_ implementation. However, for integration tests a _FixedMiners_ miner selector is used to bound the universe of available miners for deals to desired values.
17178

79+
The _MinerSelector_ API already provides enough filtering configuration to force using or excluding particular miners. In general, implementations other than the default one should be used when the universe of available miners must be completely controlled by design, rather than determined by the miners available on the connected Filecoin network.
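
To illustrate the idea of a fixed-universe selector with exclusion filtering, here is a minimal sketch. The types are simplified and hypothetical: the real `MinerSelectorFilter` and `MinerProposal` in `powergate` carry more fields, and `ErrNotEnoughMiners` and the miner addresses are invented for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// MinerSelectorFilter is a simplified, hypothetical filter with only exclusions.
type MinerSelectorFilter struct {
	ExcludedMiners []string
}

// MinerProposal is a simplified, hypothetical result holding a miner address.
type MinerProposal struct {
	Addr string
}

// MinerSelector returns up to n miners that satisfy the filter.
type MinerSelector interface {
	GetMiners(n int, f MinerSelectorFilter) ([]MinerProposal, error)
}

// ErrNotEnoughMiners is a hypothetical error for this sketch.
var ErrNotEnoughMiners = errors.New("not enough miners satisfy the filter")

// FixedMiners always selects from a configured list, in order.
type FixedMiners struct {
	Miners []string
}

// GetMiners returns the first n configured miners that aren't excluded.
func (fm FixedMiners) GetMiners(n int, f MinerSelectorFilter) ([]MinerProposal, error) {
	if n <= 0 {
		return nil, errors.New("n must be positive")
	}
	excluded := make(map[string]struct{}, len(f.ExcludedMiners))
	for _, m := range f.ExcludedMiners {
		excluded[m] = struct{}{}
	}
	res := make([]MinerProposal, 0, n)
	for _, m := range fm.Miners {
		if len(res) == n {
			break
		}
		if _, skip := excluded[m]; skip {
			continue
		}
		res = append(res, MinerProposal{Addr: m})
	}
	if len(res) < n {
		return nil, ErrNotEnoughMiners
	}
	return res, nil
}

func main() {
	// The addresses below are made-up placeholders.
	var sel MinerSelector = FixedMiners{Miners: []string{"t01000", "t01001", "t01002"}}
	miners, err := sel.GetMiners(2, MinerSelectorFilter{ExcludedMiners: []string{"t01001"}})
	fmt.Println(miners, err)
}
```

Because the list is fixed, a test can predict exactly which miners will be proposed, which is the property the integration tests rely on.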
17280

17381
### Cid Configuration
174-
Cid configurations are a central part of FFS mechanics. An _API_ defines the desired state of the Cid in the Hot and Cold storage. Currently, it has the following structure:
175-
```go
176-
// CidConfig has a Cid desired storing configuration for a Cid in
177-
// Hot and Cold storage.
178-
type CidConfig struct {
179-
// Cid is the Cid of the stored data.
180-
Cid cid.Cid
181-
// Hot has desired storing configuration in Hot Storage.
182-
Hot HotConfig
183-
// Cold has desired storing configuration in the Cold Storage.
184-
Cold ColdConfig
185-
}
186-
187-
// HotConfig is the desired storage of a Cid in a Hot Storage.
188-
type HotConfig struct {
189-
// Enable indicates if Cid data is stored. If true, it will consider
190-
// further configurations to execute actions.
191-
Enabled bool
192-
// AllowUnfreeze indicates that if data isn't available in the Hot Storage,
193-
// it's allowed to be feeded by Cold Storage if available.
194-
AllowUnfreeze bool
195-
// Ipfs contains configuration related to storing Cid data in a IPFS node.
196-
Ipfs IpfsConfig
197-
}
198-
199-
// IpfsConfig is the desired storage of a Cid in IPFS.
200-
type IpfsConfig struct {
201-
// AddTimeout is an upper bound on adding data to IPFS node from
202-
// the network before failing.
203-
AddTimeout int
204-
}
205-
206-
// ColdConfig is the desired state of a Cid in a cold layer.
207-
type ColdConfig struct {
208-
// Enabled indicates that data will be saved in Cold storage.
209-
// If is switched from false->true, it will consider the other attributes
210-
// as the desired state of the data in this Storage.
211-
Enabled bool
212-
// Filecoin describes the desired Filecoin configuration for a Cid in the
213-
// Filecoin network.
214-
Filecoin FilConfig
215-
}
216-
217-
// FilConfig is the desired state of a Cid in the Filecoin network.
218-
type FilConfig struct {
219-
// RepFactor indicates the desired amount of active deals
220-
// with different miners to store the data. While making deals
221-
// the other attributes of FilConfig are considered for miner selection.
222-
RepFactor int
223-
// DealDuration indicates the duration to be used when making new deals.
224-
DealDuration int64
225-
// ExcludedMiners is a set of miner addresses won't be ever be selected
226-
// when making new deals, even if they comply to other filters.
227-
ExcludedMiners []string
228-
// TrustedMiners is a set of miner addresses which will be forcibly used
229-
// when making new deals. An empty/nil list disables this feature.
230-
TrustedMiners []string
231-
// CountryCodes indicates that new deals should select miners on specific
232-
// countries.
233-
CountryCodes []string
234-
// Renew indicates deal-renewal configuration.
235-
Renew FilRenew
236-
// Addr is the wallet address used to store the data in filecoin
237-
Addr string
238-
}
239-
240-
// FilRenew contains renew configuration for a Cid Cold Storage deals.
241-
type FilRenew struct {
242-
// Enabled indicates that deal-renewal is enabled for this Cid.
243-
Enabled bool
244-
// Threshold indicates how many epochs before expiring should trigger
245-
// deal renewal. e.g: 100 epoch before expiring.
246-
Threshold int
247-
}
248-
```
249-
250-
Each attribute has a description of its goal.
251-
252-
Both the Hot and Cold configurations have an `Enable` flag to enable/disable the Cid data storage in each of them.
253-
If a client only wants to save data in the Cold storage, it can set `HotConfig.Enabled: false` and `ColdConfig.Enabled: true`. The same applies inversely.
82+
In this document we've referred to _CidConfigs_ as a central concept in the FFS module. A _CidConfig_ indicates the desired storing state of a _Cid_ scoped to an _API_. Refer to the [Go docs](https://pkg.go.dev/github.com/textileio/powergate/ffs@v0.0.1-beta.6?tab=doc#CidConfig) to understand its rich configuration.
83+
25484

25585
#### _API_ _Get(...)_ operation
25686
One important point is that `Get` operations in _API_ can only retrieve data from the Hot Storage (via `GetCidFromHot` in the _Scheduler_).
