When the total size of the project is too large (for example, my 11 GB project), the backup cannot be decompressed after compression

a6611035
Posts: 21
Joined: Sat May 12, 2018 3:39 am
Platform: Windows

Mon Jun 03, 2019 2:11 pm

My total project size is 11 GB, but when I set backups to use compression in the settings, or compress manually from the toolbar, the compressed file is 4 GB. When I open it and try to decompress it, I get an error saying it cannot be decompressed. When my project is small, though, decompression works fine, so I believe this is a bug.
I hope this bug will be fixed.

rwfranz
Posts: 762
Joined: Thu May 28, 2015 9:41 pm
Platform: Windows

Tue Jun 04, 2019 8:10 am

a6611035 wrote:My total project size is 11 GB, but when I set backups to use compression in the settings, or compress manually from the toolbar, the compressed file is 4 GB. When I open it and try to decompress it, I get an error saying it cannot be decompressed. When my project is small, though, decompression works fine, so I believe this is a bug.
I hope this bug will be fixed.

I do not read Chinese, but going by Google Translate, this looks like a size issue with respect to the unzip algorithm being used. There shouldn't be one: an up-to-date zip implementation (one with the ZIP64 extensions) has a file-size limit of about 16 exabytes.

steveshank
Posts: 348
Joined: Tue Mar 07, 2017 8:28 pm
Platform: Windows

Wed Jun 05, 2019 1:52 am

the compressed file is 4 GB

Unless I'm mistaken, FAT32 has a 4 GB file-size limit, so it could be the drive format, not the zip algorithm.

rwfranz
Posts: 762
Joined: Thu May 28, 2015 9:41 pm
Platform: Windows

Wed Jun 05, 2019 11:24 am

steveshank wrote:
the compressed file is 4 GB

Unless I'm mistaken, FAT32 has a 4 GB file-size limit, so it could be the drive format, not the zip algorithm.

But the original zip format (without the ZIP64 extensions) can only handle files up to 4 GB. So there are at least two possible limiting factors here.
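
For illustration, here is a minimal sketch of both limits using Python's standard zipfile module (not whatever library Scrivener itself uses; the file paths are made up):

    import zipfile

    # Classic zip headers store sizes in 32-bit fields, capping an entry
    # at 4 GiB; the ZIP64 extensions widen them to 64 bits (~16 EiB).
    # Python's zipfile refuses to write a broken archive when ZIP64 is
    # disabled and the data won't fit.
    try:
        with zipfile.ZipFile("backup.zip", "w", allowZip64=False) as zf:
            zf.write("BigProject.scriv/Files/huge.dat")  # entry > 4 GiB
    except zipfile.LargeZipFile:
        print("Too big for classic zip; ZIP64 is required.")

    # With ZIP64 enabled (the default), the same write succeeds.
    with zipfile.ZipFile("backup64.zip", "w", allowZip64=True) as zf:
        zf.write("BigProject.scriv/Files/huge.dat")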

I built a huge project by generating 99 paragraphs of Lorem Ipsum text (the online generator was very helpful) and copying it many times into a document. I added some large pictures (gargantuan maps are useful for this), then duplicated the documents inside chapters and duplicated the chapters many times.

Scrivener slowed incredibly at about 90 chapters of 11k words each.

At 12 GB (as measured by 7-Zip), the interface simply crawls. Menus respond in minutes, not seconds. Honestly, I'm amazed it still functions. I forced a backup and exited.

Scrivener's backup (Ziptest001.zip.bak) was 4,396,392,448 bytes (4.09 GB).

However, there should have been two backups: one when I forced "Back Up Now," and a second when I closed the project. I hadn't modified anything in between, so perhaps that's why there was only one. Scrivener did not like exiting (I honestly don't blame it; 12 GB... sheesh).

There are errors in the backup file.

"Unconfirmed start of archive"
"Data after payload data"
"Data error"

I'm running NTFS, which does not have a 4 GB file-size limit.

When I used 7-Zip to create a zip of the project (including its folder), the archive came to 5.7 GB (48% compression) and took 10 minutes instead of 30.

Based on this data, the zip functions are buggy with large projects. If they come from Qt code, those developers need to be warned about this.
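
For anyone who wants to quantify the damage after extracting a suspect backup, here is a minimal sketch (Python standard library only; the two paths on the command line are hypothetical) that counts what each tree actually contains:

    import os
    import sys

    # Count directories and files under a tree so the original project
    # can be compared against what an extraction actually produced.
    def tree_counts(root):
        dirs = files = 0
        for _, dirnames, filenames in os.walk(root):
            dirs += len(dirnames)
            files += len(filenames)
        return dirs, files

    # Usage: python tree_counts.py Original.scriv Extracted.scriv
    for root in sys.argv[1:3]:
        d, f = tree_counts(root)
        print(f"{root}: {d} directories, {f} files")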

Sparrowhawk
Posts: 98
Joined: Thu Dec 05, 2013 4:49 pm
Platform: Mac, Win + iOS

Fri Jun 07, 2019 7:06 pm

Nothing useful to contribute here, just wanted to say you made me look up what EB was...

:shock:

And I found ZB... and then YB. :shock: :shock: :shock:

My brain now hurts.

Seriously, what single zip file could possibly be 16 EB??? What kind of computer could even process that much data??

After all that, 11 GB does not seem overly large, but still...

What (primarily) text document could possibly be so huge??
You will find more evidence of the ridiculousness of humanity in the bathroom mirror than any other place in the world.

JimRac
Posts: 1229
Joined: Wed Aug 27, 2014 2:06 pm
Platform: Win + iOS

Fri Jun 07, 2019 7:15 pm

Sparrowhawk wrote:What (primarily) text document could possibly be so huge??
It seems that most people with mega-sized projects are storing lots (and lots) of images.
I’m just a customer.

kewms
Posts: 5435
Joined: Fri Feb 02, 2007 5:22 pm
Platform: Mac

Fri Jun 07, 2019 7:25 pm

You run into those kinds of volumes very quickly with machine learning datasets. A single autonomous car might generate 4 TB a day, and Facebook users collectively generate 4 petabytes a day.

For Scrivener projects, the really big ones are mostly incorporating a lot of video, although someone writing a heavily illustrated book (travel, photography, art history...) will obviously have a lot of images.

Katherine
Scrivener Support Team

rwfranz
Posts: 762
Joined: Thu May 28, 2015 9:41 pm
Platform: Windows

Wed Aug 07, 2019 11:40 pm

Just tested this again, and the new zip algorithm seems broken.

I used the same 11.2 GB input project.

I used "Backup to..." and specified a directory. The algorithm took some time, BUT...

The final file (4.09 GB) was smaller than it should be (5.33 GB, I believe). This told me something was missing.

Unzipping with either 7-Zip or PeaZip produces errors, and not all of the data is extracted.

Out of the 7,746 directories in Files/Data, only 103 were present after decompression.

I'm testing further to see whether this is an upgrade issue (I used the internal updater, which has been known to retain prior bugs), and to determine what size the archive should be (using 7-Zip's zip algorithm).
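
The backup can also be sanity-checked without extracting it. A minimal sketch, again assuming Python's standard zipfile module (the filename is the backup from my earlier test); if the archive was truncated, even opening it may fail:

    import zipfile

    # List what the archive claims to contain, then CRC-check every
    # entry. testzip() returns the name of the first corrupt entry,
    # or None if everything checks out.
    try:
        with zipfile.ZipFile("Ziptest001.zip.bak") as zf:
            print(f"{len(zf.namelist())} entries listed")
            bad = zf.testzip()
            print("first bad entry:", bad or "none -- archive is intact")
    except zipfile.BadZipFile:
        print("Archive unreadable (e.g., truncated central directory).")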

rwfranz
Posts: 762
Joined: Thu May 28, 2015 9:41 pm
Platform: Windows

Thu Aug 08, 2019 12:48 am

TL;DR: Zip is still not fixed for large files.

Uninstalled all instances of Scrivener.
Reinstalled Beta 20.

The backup routine refuses to write a file larger than 4,293,562 KB (~4 GB), and there is no warning that the project's size exceeds what the backup can write.

7-Zip zips the entire .scriv folder into 5,596,429 KB (~5.3 GB). Tested twice.

Again, when decompressing with 7-Zip, only 103 of the more than 7,000 directories were restored.

Errors were generated during decompression:
[Attached screenshot: scriv_7z_unzip.PNG, showing the 7-Zip decompression errors]


In addition, I noticed odd behavior during Scrivener's compression phase: the output file stayed at 669 KB until the last few seconds. This suggests the program is holding the compressed data in memory until it finishes, and if heap size is limited, that could be causing the problem. None of the other compression software I have installed works this way.
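
If that is what's happening, the cure is to stream entries to disk as they are compressed rather than buffering the whole archive. A minimal sketch of the streaming approach, using Python's standard zipfile rather than anything Scrivener actually uses (paths are made up):

    import os
    import zipfile

    # Walk the project and compress one file at a time. Each entry is
    # written out as it is compressed, so memory use stays flat no matter
    # how large the project is; only the small central directory is held
    # back until the archive is closed.
    def backup(project_dir, archive_path):
        with zipfile.ZipFile(archive_path, "w",
                             compression=zipfile.ZIP_DEFLATED,
                             allowZip64=True) as zf:
            for root, _, files in os.walk(project_dir):
                for name in files:
                    path = os.path.join(root, name)
                    zf.write(path, os.path.relpath(path, project_dir))

    backup("MyProject.scriv", "MyProject_backup.zip")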

I suggest, for time's sake, simply using an external tool like 7-Zip instead of building a zip algorithm into Scrivener. 7-Zip is faster and more efficient than whatever it is you've been using. I realize you want everything built in, but just as with other things, sometimes an external tool is the way to go.

devinganger
Posts: 1725
Joined: Sat Nov 06, 2010 1:55 pm
Platform: Mac, Win + iOS
Location: Monroe, WA 98272 (CN97au)

Thu Aug 08, 2019 11:27 pm

rwfranz wrote:I suggest, for time's sake, simply using an external tool like 7-Zip instead of building a zip algorithm into Scrivener. 7-Zip is faster and more efficient than whatever it is you've been using. I realize you want everything built in, but just as with other things, sometimes an external tool is the way to go.


Rather than build in a dependency on an external component, they probably just need to find an updated ZIP library that supports the ZIP64 format for dealing with files larger than 4 GB.
--
Devin L. Ganger, WA7DLG
Not a L&L employee; opinions are those of my cat
Winner "Best in Class", 2018 My First Supervillain Photo Shoot

rwfranz
Posts: 762
Joined: Thu May 28, 2015 9:41 pm
Platform: Windows

Fri Aug 09, 2019 8:04 pm

devinganger wrote:
rwfranz wrote:I suggest, for time's sake, simply using an external tool like 7-Zip instead of building a zip algorithm into Scrivener. 7-Zip is faster and more efficient than whatever it is you've been using. I realize you want everything built in, but just as with other things, sometimes an external tool is the way to go.

Rather than build in a dependency on an external component, they probably just need to find an updated ZIP library that supports the ZIP64 format for dealing with files larger than 4 GB.


I thought that's what they'd done when they replaced the zip library. Evidently not. Of course, building in new, unfamiliar libraries is fraught with issues. It may not be the zip lib; it may be whatever code they're using to write to disk. It may be their memory model. It may be how they're calling the library; who knows.

But given that it's a simple backup, bundling a (freely usable) external program with Scrivener (to make sure it's there) is not unreasonable. I understand the idea that avoiding the dependency is likely better, but it's only better if it works.