Documented zero-length RHS behavior with by #7768

Open
venom1204 wants to merge 4 commits into master from issue3912

Conversation

@venom1204
Contributor

closes #3912

This PR adds documentation for the special handling of zero-length RHS values in `:=` assignments when using `by`.

Hi @ben-schwen, can you please have a look at it when you have time?
Thank you.

@venom1204 venom1204 requested a review from MichaelChirico as a code owner May 29, 2026 19:20
@venom1204 venom1204 requested a review from joshhwuu May 29, 2026 19:20

#### Note on zero-length RHS and `by`

* If the `RHS` of an assignment results in a zero-length vector (e.g., `numeric(0)`), `data.table` will usually throw an error. However, when using `by`, a zero-length result for a specific group is treated as a no-op (the column remains unchanged for that group) and no error is thrown. This is intentional to allow functions that might return no data for certain subsets to complete without crashing the entire operation.
Member

  1. What does "usually" mean here?

  2. "Remains unchanged" also might not be true: what if we add a new column with a zero-length RHS?

  3. What about `set`?

Contributor Author

I will remove "usually" and make the distinction explicit.

  • Regarding "remains unchanged": you are right, this only applies to existing columns. I checked the behavior for new columns as well, and groups returning a zero-length result are initialized with NA.

  • For `set`, I checked `set(DT, j="x", value=numeric(0))` and `set(DT, i=1L, j="x", value=numeric(0))`. Both give an error, so this special handling appears to be specific to grouped `:=` with `by`.
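
For anyone who wants to reproduce the two checks described above, here is a minimal sketch. The table `DT`, its columns `grp`, `v`, `x`, and the new column `y` are made up for illustration and are not taken from the PR; the first group is deliberately the one that returns a value, so the new column's type is fixed before a zero-length group is reached.

```r
library(data.table)

# Hypothetical table: group "a" has three rows, group "b" has two.
DT <- data.table(grp = c("a", "a", "a", "b", "b"), v = 1:5, x = 11:15)

# New column via grouped `:=`: group "b" returns integer(0).
# Its rows are reported to be left at NA rather than raising an error.
DT[, y := if (.N > 2) sum(v) else integer(0), by = grp]
print(DT)

# set() with a zero-length value on an existing column: capture the
# condition instead of stopping, so both calls can be inspected.
err_all <- tryCatch(set(DT, j = "x", value = numeric(0)),
                    error = function(e) e)
err_row <- tryCatch(set(DT, i = 1L, j = "x", value = numeric(0)),
                    error = function(e) e)
print(conditionMessage(err_all))
print(conditionMessage(err_row))
```

The exact error text may differ between data.table versions, so the sketch prints the messages rather than assuming their wording.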

Comment thread man/assign.Rd
DT[i, (colvector) := val] # same (NOW PREFERRED) shorthand syntax. The parens are enough to stop the LHS being a symbol; same as c(colvector).
DT[i, colC := mean(colB), by = colA] # update (or add) column called "colC" by reference by group. A major feature of `:=`.
DT[,`:=`(new1 = sum(colB), new2 = sum(colC))] # Functional form
DT[, let(new1 = sum(colB), new2 = sum(colC))] # New alias for functional form.
Member

Maybe add an example for a no-op?

DT[, x := if (.N > 2) sum(col) else integer(0), by=grp]
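
A runnable version of that suggestion might look like the sketch below. The data is hypothetical (the columns `grp`, `col`, and `x` are chosen only to match the names in the one-liner), and `x` already exists so that the no-op on the small group is visible.

```r
library(data.table)

# Hypothetical data: group "a" has three rows, group "b" has two.
DT <- data.table(
  grp = c("a", "a", "a", "b", "b"),
  col = 1:5,
  x   = c(10L, 20L, 30L, 40L, 50L)
)

# Groups with more than two rows get sum(col); smaller groups return
# integer(0), which grouped `:=` treats as "leave this group alone".
DT[, x := if (.N > 2) sum(col) else integer(0), by = grp]

print(DT)
# Group "a" now holds the group sum in x; group "b" keeps 40 and 50.
```

Keeping the RHS type the same in both branches (integer here) avoids mixing in a type-coercion question that is separate from the zero-length behavior.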

Comment thread man/assign.Rd
# 3. Multiple columns in place
# DT[i, names(.SD) := lapply(.SD, fx), by = ..., .SDcols = ...]

set(x, i = NULL, j, value)
Member

@ben-schwen ben-schwen May 29, 2026

Actually, I just noticed that we never explain `j` and `value`.
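
As a starting point for that explanation, the sketch below shows how `j` and `value` are commonly used with `set()`: `j` picks the column by name or by position, and `value` supplies what to assign. The table and column names are hypothetical, and this is one reading of the arguments rather than proposed wording for the Rd file.

```r
library(data.table)

# Hypothetical table used only to illustrate the arguments.
DT <- data.table(a = 1:3, b = c(0.5, 1.5, 2.5))

# j as a column name, value as a full-length replacement vector.
set(DT, j = "b", value = c(10, 20, 30))

# j as a column position, i restricting the assignment to row 2.
set(DT, i = 2L, j = 1L, value = 99L)

# j naming a column that does not exist yet: the column is added.
set(DT, j = "c", value = letters[1:3])

print(DT)
```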

Contributor Author

Hi, can you please have a look at whether this new version is correct or whether something needs to be changed?
Thanks.

Development

Successfully merging this pull request may close these issues.

Zero-length column re-assignment does not error using 'by'

2 participants