bug fix: truncate wide columns in print.data.table #7718#7758
bug fix: truncate wide columns in print.data.table #7718#7758venom1204 wants to merge 2 commits into
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #7758 +/- ##
==========================================
- Coverage 99.04% 99.04% -0.01%
==========================================
Files 87 87
Lines 17064 17063 -1
==========================================
- Hits 16901 16900 -1
Misses 163 163 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
joshhwuu
left a comment
There was a problem hiding this comment.
thanks, left a few comments! in the future, you can tag me as a reviewer instead of pinging me in the description. also please try to summarize changes in the description instead of referring to the issue so others can get context quicker 🙂
| # Current implementation may have issues when dealing with strings that have combinations of full-width and half-width characters, | ||
| # if this becomes a problem in the future, we could consider string traversal instead. | ||
| char.trunc = function(x, trunc.char = getOption("datatable.prettyprint.char")) { | ||
| if (is.null(trunc.char)) trunc.char = getOption("width") - 5L |
There was a problem hiding this comment.
just curious, why 5L? and this looks like a magic number, could we use a constant instead?
There was a problem hiding this comment.
my major reason of using 5L was because toby sir mentioned this would options(datatable.prettyprint.char=getOption("width")-5) be possible? in the issue
There was a problem hiding this comment.
would 5L still make sense if we had a large table and the row label prefix consists of many more characters? I would argue against a magic number or even reading the row label prefix if we can here
There was a problem hiding this comment.
you are right the 5L buffer fails once row labels grow bigger on a narrow consoles.
to fix this we can either
- increase the constant biffer instead of using 5L
- or we can pass the width dyanamically for precise fit.
There was a problem hiding this comment.
can you please provide a couple examples of when 5L works and does not work?
does is work for the examples I put in the original issue?
There was a problem hiding this comment.
its working fine for your example ,i was accounting about the final width because the 5L bugffer doent not account all layout components (row labels,spacing,'...')
for ex
options(width=25, datatable.prettyprint.char=NULL) dt = data.table( x = rep("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 1000000) )
print(dt, topn=1)
>1000000: ABCDEFGHIJKLMNOPQRST...
if you count the truncated data only then its behaving correctly , however if teh fully rendered line is considered then its excedding the 25 limit.
| nchar_chars = nchar(x, 'char', allowNA=TRUE) | ||
| is_full_width = nchar_width > nchar_chars | ||
| idx = !is.na(x) & pmin(nchar_width, nchar_chars) > trunc.char | ||
| idx = !is.na(x) & !is.na(nchar_width) & pmin(nchar_width, nchar_chars) > trunc.char |
There was a problem hiding this comment.
could we use na.rm in pmin instead of a guard?
There was a problem hiding this comment.
i tried using na.rm = TRUE in pmin but found it causes NA values to be incorrectly flagged for truncation which leads to NA becoming "NA..." (breaking test 2253.2) and invalid 'width' argument crashes in strtrim.
|
Generated via commit 1af6897 Download link for the artifact containing the test results: ↓ atime-results.zip
|
| options(width=10, datatable.prettyprint.char=NULL) | ||
| test(2374.1, capture.output(print(data.table(x="12345678901234567890"))), output="12345...") |
There was a problem hiding this comment.
please use test(options=) instead of options(…), here and below
| options(width=10, datatable.prettyprint.char=NULL) | |
| test(2374.1, capture.output(print(data.table(x="12345678901234567890"))), output="12345...") | |
| test(2374.1, capture.output(print(data.table(x="12345678901234567890"))), output="12345...", options=list(width=10, datatable.prettyprint.char=NULL)) |

closes #7718
hi @joshhwuu
I incorporated the changes as proposed, below is a short summary of the changes
print.data.table.r- i updated thechar.truncto dafualtgetOption("width") - 5L, and also added condition to apply for lists. improved safety for strings with missing width values.Kindly review this when you have some time, and please let me know if you have any suggestions for improvement.
thank you.