feat: migration ("re-indexing"), backfilling and diagnostics tooling for the ChainIndexer
#12450
base: feat/msg-eth-tx-index
Please update the PR title to match https://github.com/filecoin-project/lotus/blob/master/CONTRIBUTING.md#pr-title-conventions
chainindex/api.go
Outdated
return &types.IndexValidation{
	TipsetKey: ts.Key().String(),
	Height: uint64(ts.Height()),
	Backfilled: true,
Fill out the other fields here using SQL queries.
@rvagg Would be great to have a first round of review here when you get the time.
Suggestions from Steve's read of the user doc. Co-authored-by: Steve Loeppky <[email protected]>
#### The `ChainValidateIndex` JSON RPC API

Please refer to the [Lotus API documentation](https://github.com/filecoin-project/lotus/blob/master/documentation/en/api-v1-unstable-methods.md) for detailed documentation of the `ChainValidateIndex` JSON RPC API.
The link will have updated docs for the `ChainIndexer` RPC once we merge this to master.
Okay, starting work on the automated tests as it is the last remaining dev task for this workstream.
Great docs updates - thanks.
I read the updated docs and left some comments. I also went through everything yesterday and resolved any conversation that I saw had been incorporated. I think anything still open is still relevant.
The big feedback item from this revision is whether we want to write the chainindexer SQLite state to a different directory, since that simplifies the migration and rollback story.
I can meet in office on 2024-09-25 Pacific morning if that helps for closing anything out as I don't want to drag this out on you.
2. **Backup Existing Index Databases**
   - Before restarting your Lotus node, back up the directory containing your existing index databases (`MsgIndex`, `EthTxIndex`, and `EventIndex`).
   - These databases are located in the `{$LOTUS_PATH/sqlite}` directory.
   - Use the following command to copy the entire directory:
     ```bash
     cp -r $LOTUS_PATH/sqlite {destination_path}
     ```
   - **Note: If you have configured a custom directory path for the Index databases using the `Events.DatabasePath` config option, replace `{$LOTUS_PATH/sqlite}` with your custom path.**
   - These backups are essential for potential rollbacks, even though they are not used in the migration process.

3. **Remove Old Index Files**
   - After creating backups, remove the `{$LOTUS_PATH/sqlite}` directory (*or your custom index database path*) using the following command:
     ```bash
     rm -rf $LOTUS_PATH/sqlite
     ```
   - **Warning: Please ensure and validate that you have made backups of your existing index databases before removing the directory.**
Suggested change:
2. **Backup and Remove Existing Index Databases**
   - The backup is accomplished by moving the legacy `{$LOTUS_PATH/sqlite}` directory so the existing index databases (`MsgIndex`, `EthTxIndex`, and `EventIndex`) don't get overwritten.
   - Use the following command to move the entire directory:
     ```bash
     mv $LOTUS_PATH/sqlite $LOTUS_PATH/sqlite-pre-chainindexer
     ```
   - **Note: If you had configured a custom directory path for the Index databases using the `Events.DatabasePath` config option, replace `{$LOTUS_PATH/sqlite}` with your custom path.**
   - These old indexes are essential for potential rollbacks, even though they are not used in the migration process.
I realized we can kill two birds with one stone here: moving accomplishes both the backup and the "removal".
That said, we could simplify this further if chainindexer didn't use `$LOTUS_PATH/sqlite`. What if instead we wrote to `$LOTUS_PATH/chainindex`? (I don't have access to a full Lotus node to come up with other ideas.) I assume we don't want to write under `chain` even though these indices are related to the chain. ChatGPT tells me there is an `indices` directory on full nodes, but I don't know what that holds.
This is done.
In case you need to rollback to the previous indexing system (`EthTxIndex`, `MsgIndex`, and `EventIndex`), follow these steps:
1. Stop your Lotus node.
2. Remove the current `${LOTUS_PATH}/sqlite` directory and replace it with the backup taken in the "**Backup Existing Index Databases**" section of the [Migration Guide](#migration-guide).
Suggested change:
2. Replace the current `${LOTUS_PATH}/sqlite` with the backup taken in the "**Backup Existing Index Databases**" section of the [Migration Guide](#migration-guide).
   ```bash
   mv $LOTUS_PATH/sqlite-pre-chainindexer $LOTUS_PATH/sqlite
   ```
Related to previous discussion, this step goes away if we just write chainindexer sql state to somewhere else.
Would love to know what @rvagg thinks about this. Writing the chainindexer sql state to another directory makes sense to me as it does simplify migration ops.
@aarshkshah1992: @rvagg and I spoke verbally on 2024-09-25 (my time) and he was supportive of this idea of writing the chainindexer SQLite state to a different directory.
@BigLep This was a great idea and is now done. These steps look much cleaner now. Please take a look.
api/api_full.go
Outdated
// - error: An error object if the validation/backfill fails. The error message will contain details about the index
//   corruption if the call fails because of an inconsistency between indexed data and the actual chain state.
//
// Note: The API returns an error if the index does not have data for the specified epoch and backfill is set to false.
Inline this into your documentation above about "error"?
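To make the documented contract concrete, here is a minimal sketch of the behaviour the comment describes: an error when the epoch is missing from the index and backfill is false, and a backfilled result when backfill is true. Everything in it is an assumption made for illustration: `validateEpoch`, `ErrNotIndexed`, the toy in-memory `indexed` map, and the trimmed `validation` struct are hypothetical and are not the Lotus implementation or its real signature.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotIndexed is a hypothetical sentinel error for an epoch that has no index data.
var ErrNotIndexed = errors.New("epoch not indexed")

// validation carries only fields visible in the snippets under review;
// the real IndexValidation type may contain more.
type validation struct {
	Height     uint64
	Backfilled bool
}

// validateEpoch is a hypothetical stand-in for the validation call.
// indexed is a toy in-memory index: the set of epochs that already have entries.
func validateEpoch(indexed map[uint64]bool, epoch uint64, backfill bool) (*validation, error) {
	if indexed[epoch] {
		// Data is present; nothing needed backfilling.
		return &validation{Height: epoch}, nil
	}
	if !backfill {
		// Documented behaviour: missing data plus backfill=false is an error.
		return nil, fmt.Errorf("epoch %d: %w", epoch, ErrNotIndexed)
	}
	// Pretend to backfill the missing epoch from chain state.
	indexed[epoch] = true
	return &validation{Height: epoch, Backfilled: true}, nil
}

func main() {
	idx := map[uint64]bool{10: true}

	_, err := validateEpoch(idx, 11, false)
	fmt.Println("missing, no backfill:", errors.Is(err, ErrNotIndexed))

	v, err := validateEpoch(idx, 11, true)
	fmt.Println("missing, backfill:", v.Backfilled, err)
}
```

Wrapping a sentinel with `%w` lets a caller distinguish "not indexed" from a corruption error using `errors.Is`; whether the real API exposes such a distinction is not stated in this thread.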
api/api_full.go
Outdated
// IndexValidation contains detailed information about the validation status of a specific chain epoch.
//type IndexValidation struct {
Since this is docs about IndexValidation, should these comments be moved to where IndexValidation is defined?
…coin-project/lotus into feat/implement-index-validation-api
* feat: add event entry count in validation API * address comments
@BigLep Have addressed your second round of review on the RPC user doc.
From a docs perspective, things look good to me. I left a couple of small suggestions.
…/implement-index-validation-api
ChainIndexer Migration and Diagnostics Tooling
This PR implements the "migration" (really re-indexing / backfilling), and diagnostics tooling for the `ChainIndexer` implemented in PR #12450, and is part of the work for #12453. This tooling takes the form of both RPC APIs on the daemon and `lotus-shed` CLI commands.

Re-indexing Process
The re-indexing tool enables clients to index their entire existing ChainState in the `ChainIndexer`. This process is necessary due to the removal of the existing `MsgIndex`, `EthTxIndex`, and `EventIndex` from Lotus.

Why Re-index Instead of Migrate?
We've chosen to re-index rather than migrate data from existing indices for two primary reasons:
Instead, we're re-indexing the `Chainstore`/`Chainstate` on the node into the `ChainIndexer`. This ensures that all re-indexed entries have gone through the indexing logic of the new `ChainIndexer` and that the Index is in sync/reflects the actual contents of the `Chainstore`/`Chainstate` post re-indexing.

Diagnostics Tooling
This PR introduces diagnostic tools for detecting corrupt Index entries at specific epochs or epoch ranges.
While this PR implements functionality for optionally backfilling missing Index entries, it does not yet include the capability to "repair" corrupted Index entries. The repair functionality will be introduced in a subsequent PR. This approach allows us to first gather and analyze user reports, helping us understand the types and causes of corrupted Index entries (and whether they exist at all in the new `ChainIndexer`) before implementing repair mechanisms.

Core API
The fundamental building block for this tooling is the following RPC API:
This API has the following features:
`Chainstore` state and the Indexed entries (tipset messages/events)

`lotus-shed` CLI tooling
The `lotus-shed` CLI tooling for both re-indexing/backfilling and diagnostics can then invoke this RPC API over epoch ranges. The corresponding `lotus-shed backfill index [from, to]` and `lotus-shed inspect index [from, to]` can then backfill/inspect/diagnose the Index for the given epoch ranges.

TODO