File: module/model/user/generated/10_model_schema.md
Lines changed: 4 additions & 2 deletions
@@ -15,7 +15,8 @@ Detail specification is defined by using `InferenceSchema` class, following are
 |-------|------|-------------|-----------|
 |`feature_types`| Dict[str, ValueType]| Mapping from feature name to the type of the feature | True |
 |`model_prediction_output`| PredictionOutput | Prediction specification that differs between model types, e.g. BinaryClassificationOutput, RegressionOutput, RankingOutput | True |
-|`prediction_id_column`| str | The name of the column that contains the prediction id value | True |
+|`session_id_column`| str | The name of the column that uniquely identifies a request | True |
+|`row_id_column`| str | The name of the column that uniquely identifies a row within a request | True |
 |`tag_columns`| Optional[List[str]]| List of column names that contain additional information about the prediction; you can treat it as metadata | False |

 From the above we can see the `model_prediction_output` field, which has type `PredictionOutput`. This field is a specification of the prediction generated by the model, depending on its model type. Currently we support 3 model types in the schema:
@@ -73,7 +74,8 @@ from merlin.observability.inference import InferenceSchema, ValueType, BinaryCla
File: module/model/user/generated/11_model_observability.md
Lines changed: 8 additions & 8 deletions
@@ -33,17 +33,17 @@ Beside changes in signature, you can see some of those methods returning new typ
 | Field | Type | Description|
 |-------|------|------------|
-|`prediction_ids`| List[str]| Unique identifier for each row in the prediction |
-|`features`| Union[Values, pandas.DataFrame]| Feature values used by the model to generate the prediction. The length of features should be the same as that of `prediction_ids`|
+|`row_ids`| List[str]| Unique identifier for each row in the prediction |
+|`features`| Union[Values, pandas.DataFrame]| Feature values used by the model to generate the prediction. The length of features should be the same as that of `row_ids`|
 |`entities`| Optional[Union[Values, pandas.DataFrame]]| Additional data that is not used for prediction but is used to retrieve other features, e.g. with `driver_id` we can retrieve the features associated with a certain `driver_id`|
-|`session_id`| str | Identifier for the request. This value is used together with `prediction_ids` as the prediction identifier in the model observability system |
+|`session_id`| str | Identifier for the request. This value is used together with `row_ids` as the prediction identifier in the model observability system |

 `ModelInput` data is essential for model observability since it contains the feature values and the identifier of each prediction. Feature values are used to calculate feature drift, and the identifier is used as the join key between features, prediction data, and ground truth data. On the other hand, `ModelOutput` is the class that represents the raw model prediction output, not the final output of the PyFunc model. The `ModelOutput` class contains the following fields:

 | Field | Type | Description |
 |-------|------|-------------|
 |`prediction`| Values |`predictions` contains the prediction output from ml_predict; it may contain multiple columns, e.g. for multiclass classification, or for binary classification that contains both prediction score and label |
-|`prediction_ids`| List[str]| Unique identifier for each row in the prediction output |
+|`row_ids`| List[str]| Unique identifier for each row in the prediction output |

 Like `ModelInput`, `ModelOutput` is also essential for model observability: it can be used to calculate prediction drift and, more importantly, performance metrics.
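The hunk above renames `prediction_ids` to `row_ids` and states that `session_id` together with `row_ids` forms the prediction identifier used as a join key against ground truth. The following is a minimal, self-contained sketch of that idea only. `InputSketch`, `prediction_keys`, and `join_with_ground_truth` are hypothetical names introduced for illustration; they are not part of the Merlin SDK, and the real `ModelInput` class may validate or store these fields differently.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (session_id, row_id)


@dataclass
class InputSketch:
    """Hypothetical stand-in mirroring the documented ModelInput fields."""
    session_id: str
    row_ids: List[str]
    features: List[Dict[str, float]]

    def __post_init__(self) -> None:
        # The documentation requires features and row_ids to have the same length.
        if len(self.features) != len(self.row_ids):
            raise ValueError("features and row_ids must have the same length")


def prediction_keys(session_id: str, row_ids: List[str]) -> List[Key]:
    """Pair the request-level id with each row-level id."""
    return [(session_id, row_id) for row_id in row_ids]


def join_with_ground_truth(
    keys: List[Key],
    predictions: List[float],
    ground_truth: Dict[Key, float],
) -> Dict[Key, Tuple[float, float]]:
    """Keep only the rows whose (session_id, row_id) key has a ground truth value."""
    return {
        key: (pred, ground_truth[key])
        for key, pred in zip(keys, predictions)
        if key in ground_truth
    }
```

With this sketch, two requests can reuse the same row ids (for example `"r1"`) without colliding, because the session id disambiguates them. That is one plausible reason for splitting the single prediction id into a request-level and a row-level identifier, although the diff itself does not state the motivation.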
@@ -61,21 +61,21 @@ There is not much change on the deployment part, users just needs to set `enable
 * featureC that has string type
 * featureD that has float type

-The model type is ranking, with the prediction group id located in the `session_id` column, the prediction id in the `prediction_id` column, the rank score in the `score` column, and `relevance_score_column` in `relevance_score`. Below is a snippet of the Python code
+The model type is ranking, with the prediction group id located in the `session_id` column, the row id in the `row_id` column, the rank score in the `score` column, and `relevance_score_column` in `relevance_score`. Below is a snippet of the Python code
File: module/model/user/templates/10_model_schema.md
Lines changed: 4 additions & 2 deletions
@@ -15,7 +15,8 @@ Detail specification is defined by using `InferenceSchema` class, following are
 |-------|------|-------------|-----------|
 |`feature_types`| Dict[str, ValueType]| Mapping from feature name to the type of the feature | True |
 |`model_prediction_output`| PredictionOutput | Prediction specification that differs between model types, e.g. BinaryClassificationOutput, RegressionOutput, RankingOutput | True |
-|`prediction_id_column`| str | The name of the column that contains the prediction id value | True |
+|`session_id_column`| str | The name of the column that uniquely identifies a request | True |
+|`row_id_column`| str | The name of the column that uniquely identifies a row within a request | True |
 |`tag_columns`| Optional[List[str]]| List of column names that contain additional information about the prediction; you can treat it as metadata | False |

 From the above we can see the `model_prediction_output` field, which has type `PredictionOutput`. This field is a specification of the prediction generated by the model, depending on its model type. Currently we support 3 model types in the schema:
@@ -73,7 +74,8 @@ from merlin.observability.inference import InferenceSchema, ValueType, BinaryCla
File: module/model/user/templates/11_model_observability.md
Lines changed: 8 additions & 8 deletions
@@ -33,17 +33,17 @@ Beside changes in signature, you can see some of those methods returning new typ
 | Field | Type | Description|
 |-------|------|------------|
-|`prediction_ids`| List[str]| Unique identifier for each row in the prediction |
-|`features`| Union[Values, pandas.DataFrame]| Feature values used by the model to generate the prediction. The length of features should be the same as that of `prediction_ids`|
+|`row_ids`| List[str]| Unique identifier for each row in the prediction |
+|`features`| Union[Values, pandas.DataFrame]| Feature values used by the model to generate the prediction. The length of features should be the same as that of `row_ids`|
 |`entities`| Optional[Union[Values, pandas.DataFrame]]| Additional data that is not used for prediction but is used to retrieve other features, e.g. with `driver_id` we can retrieve the features associated with a certain `driver_id`|
-|`session_id`| str | Identifier for the request. This value is used together with `prediction_ids` as the prediction identifier in the model observability system |
+|`session_id`| str | Identifier for the request. This value is used together with `row_ids` as the prediction identifier in the model observability system |

 `ModelInput` data is essential for model observability since it contains the feature values and the identifier of each prediction. Feature values are used to calculate feature drift, and the identifier is used as the join key between features, prediction data, and ground truth data. On the other hand, `ModelOutput` is the class that represents the raw model prediction output, not the final output of the PyFunc model. The `ModelOutput` class contains the following fields:

 | Field | Type | Description |
 |-------|------|-------------|
 |`prediction`| Values |`predictions` contains the prediction output from ml_predict; it may contain multiple columns, e.g. for multiclass classification, or for binary classification that contains both prediction score and label |
-|`prediction_ids`| List[str]| Unique identifier for each row in the prediction output |
+|`row_ids`| List[str]| Unique identifier for each row in the prediction output |

 Like `ModelInput`, `ModelOutput` is also essential for model observability: it can be used to calculate prediction drift and, more importantly, performance metrics.
@@ -61,21 +61,21 @@ There is not much change on the deployment part, users just needs to set `enable
 * featureC that has string type
 * featureD that has float type

-The model type is ranking, with the prediction group id located in the `session_id` column, the prediction id in the `prediction_id` column, the rank score in the `score` column, and `relevance_score_column` in `relevance_score`. Below is a snippet of the Python code
+The model type is ranking, with the prediction group id located in the `session_id` column, the row id in the `row_id` column, the rank score in the `score` column, and `relevance_score_column` in `relevance_score`. Below is a snippet of the Python code