
Splitting .json file

Check out EmEditor: https://www.emeditor.com/
It's the best text editor I know of for large files. I use it on files with 5M lines or more with no issues.
It can do this because, instead of loading the whole file into RAM, EmEditor works from the hard drive, so large files open without any trouble.
You can find it cracked if you don't want to pay for it.
Note: it's Windows only.
 
On Linux, you can do this using the `split` command. For example, to get 10 parts without cutting any line in half (GNU coreutils):

Bash:
split -n l/10 your_file.txt

Alternatively, here is a simple cross-platform Python script:

Python:
import argparse
import math
import os

# Command-line arguments: the file to split, the number of parts, and verbosity.
parser = argparse.ArgumentParser(description='Split file into parts')
parser.add_argument('-n', type=int, dest='parts', required=True,
                    help='number of parts to split file into')
parser.add_argument('--verbose', '-v', action='count', default=0)
parser.add_argument('file', help='name of file to be split')
args = parser.parse_args()
name = args.file
parts = args.parts
verbose = args.verbose
if parts < 1:
    parser.error('-n must be at least 1')

# Round the chunk size up so the file never spills into an extra part.
size = os.path.getsize(name)
chunk_size = max(1, math.ceil(size / parts))

# Binary mode: getsize() counts bytes, and this avoids newline translation
# and miscounting multi-byte characters. (The := operator needs Python 3.8+.)
with open(name, 'rb') as original_file:
    n = 1
    while (chunk := original_file.read(chunk_size)):
        chunk_name = f'{name}.part.{n}'
        with open(chunk_name, 'wb') as nth_chunk:
            nth_chunk.write(chunk)
        if verbose:
            print(chunk_name)
        n += 1
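
One caveat: the script above cuts the file at fixed offsets, so a line (or a JSON record, if the file has one record per line) can end up split across two parts. Here is a rough sketch of a variant that only cuts at line boundaries, similar in spirit to `split -n l/10`. The function name `split_on_lines` is made up for this example, and it assumes the file actually contains newlines; a minified single-line JSON file would come out as one part.

```python
import os


def split_on_lines(name, parts):
    """Split `name` into at most `parts` files without cutting any line in half."""
    size = os.path.getsize(name)
    target = max(1, -(-size // parts))  # ceiling division: target bytes per part
    written = []
    out = None
    out_bytes = 0
    with open(name, 'rb') as src:
        for line in src:
            # Start a new part when none is open yet or the current one is full.
            if out is None or out_bytes >= target:
                if out is not None:
                    out.close()
                chunk_name = f'{name}.part.{len(written) + 1}'
                out = open(chunk_name, 'wb')
                written.append(chunk_name)
                out_bytes = 0
            out.write(line)
            out_bytes += len(line)
    if out is not None:
        out.close()
    return written
```

Calling `split_on_lines('your_file.txt', 10)` returns the list of part names it wrote. Parts can be a bit larger than the target because a line is never broken, so you may get fewer than 10 files. The naming matches the script above (`.part.1`, `.part.2`, ...).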

This should basically get the job done. If you want to reverse the process (i.e., put the file back together), you can run this from the directory that contains the parts:

Python:
from os import listdir

# Collect files named <output>.part.<N>; skip anything whose suffix is not a number.
# Assumes the current directory holds the parts of only one split file.
parts = []
output = None
for file in listdir('.'):
    file_parts = file.split('.')
    if len(file_parts) < 3 or file_parts[-2] != 'part':
        continue
    try:
        int(file_parts[-1])
    except ValueError:
        continue
    parts.append(file)
    output = '.'.join(file_parts[:-2])

if output is None:
    raise SystemExit('no .part.N files found in the current directory')

# Sort numerically so that part 10 comes after part 9, not after part 1.
parts.sort(key=lambda x: int(x.split('.')[-1]))

# Binary mode reproduces the original bytes exactly.
with open(output, 'wb') as output_file:
    for p in parts:
        with open(p, 'rb') as part:
            output_file.write(part.read())
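
If you want to be sure the rejoined file is identical to the original, one option is to compare checksums. This is just a sketch: `sha256_of` and its `block_size` parameter are names I made up, and it assumes you kept a copy of the original somewhere else, since the reassembly script writes to the original file name.

```python
import hashlib


def sha256_of(path, block_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in fixed-size blocks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        # Read block by block so a large file never has to fit in memory.
        while (block := f.read(block_size)):
            digest.update(block)
    return digest.hexdigest()
```

For example, `sha256_of('backup/your_file.txt') == sha256_of('your_file.txt')` should be `True` after a clean round trip (both paths are placeholders).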

Just some simple scripts I threw together, in case it helps! Otherwise, the split command or a premade solution is probably the way to go.

Good luck, and happy hacking :)
 

