I use pandas on Google Colab to aggregate company sales data.
My company has a G Suite contract, and I am logged in with my company account.

Every week I upload CSV data to Google Colab, process it with pandas, and download the result as CSV.
I recently looked at the disk and noticed that it was almost full.

When I check the disk status with "!df -h", I get the output shown at the end of this post. If the disk fills up, I will no longer be able to aggregate the sales data, so I want to know:
1. What kind of data is stored on the disk?
2. How can I clean it up?
I'm stuck because I couldn't find any relevant article.
If anyone knows, please help.
Thank you.

Filesystem      Size  Used Avail Use% Mounted on
overlay          49G   35G   13G  74% /
tmpfs            64M     0   64M   0% /dev
tmpfs           6.4G     0  6.4G   0% /sys/fs/cgroup
tmpfs           6.4G  8.0K  6.4G   1% /var/colab
/dev/sda1        55G   36G   20G  65% /etc/hosts
shm             6.0G  4.0K  6.0G   1% /dev/shm
tmpfs           6.4G     0  6.4G   0% /proc/acpi
tmpfs           6.4G     0  6.4G   0% /proc/scsi
tmpfs           6.4G     0  6.4G   0% /sys/firmware

  • Answer #1

    If you want more capacity, mount Google Drive and read and write your files there.

    Open the left pane and click "Mount Drive". A cell with the following code will be added.

    from google.colab import drive
    drive.mount('/content/drive')

    When you run the cell and allow access, an authorization code is displayed. Return to Colaboratory and paste the code in, and the drive will be mounted.

    Once your files appear in the left pane, you can read or write a file by copying its path and passing that path to pandas.

    import pandas as pd

    # read a CSV file stored on the mounted Drive
    df = pd.read_csv("/content/drive/My Drive/test.csv")
    # write the DataFrame back to the mounted Drive
    df.to_csv("/content/drive/My Drive/test.csv")
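
    For the weekly workflow in the question, the read and write steps above could be wrapped in one function. This is only a minimal sketch: the function name, the file names, and the column names "product" and "amount" are assumptions for illustration, since the question does not say what the sales data looks like.

```python
import pandas as pd


def aggregate_weekly_sales(src_csv, dst_csv, group_col, value_col):
    """Read a sales CSV, sum value_col per group_col, and write the result as CSV."""
    df = pd.read_csv(src_csv)
    # One row per group, with the values in value_col summed
    summary = df.groupby(group_col, as_index=False)[value_col].sum()
    # index=False keeps the pandas row index out of the output file
    summary.to_csv(dst_csv, index=False)
    return summary


# Example call in Colab after mounting Drive (hypothetical file and column names):
# aggregate_weekly_sales(
#     "/content/drive/My Drive/sales_in.csv",
#     "/content/drive/My Drive/sales_summary.csv",
#     group_col="product",
#     value_col="amount",
# )
```

    Because both paths point at the mounted Drive, neither the input nor the result takes up space on the Colab disk.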

    About 33.89 GB is already in use from the start.
    That space is taken by the OS and the development environment that come preinstalled.
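
    To see how much of the disk is your own data rather than the preinstalled environment, you could measure it from Python. The sketch below uses only the standard library; the function names are made up for this example, and pointing it at "/content" assumes that is where your uploaded CSV files end up.

```python
import os
import shutil


def disk_usage_gib(path="/"):
    """Return total, used, and free space of the filesystem holding path, in GiB."""
    usage = shutil.disk_usage(path)
    gib = 2 ** 30  # df -h also reports sizes in powers of 1024
    return {
        "total": usage.total / gib,
        "used": usage.used / gib,
        "free": usage.free / gib,
    }


def dir_size_bytes(root):
    """Sum the sizes of all regular files below root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_path = os.path.join(dirpath, name)
            if os.path.islink(file_path):
                continue
            try:
                total += os.path.getsize(file_path)
            except OSError:
                pass  # the file disappeared or is unreadable; ignore it
    return total


def entry_sizes(root):
    """Return (name, size_in_bytes) for each entry directly under root, largest first."""
    sizes = []
    for entry in os.scandir(root):
        if entry.is_symlink():
            continue
        if entry.is_dir():
            size = dir_size_bytes(entry.path)
        else:
            size = entry.stat().st_size
        sizes.append((entry.name, size))
    return sorted(sizes, key=lambda item: item[1], reverse=True)


# Example calls in Colab:
# print(disk_usage_gib("/"))
# for name, size in entry_sizes("/content"):
#     print(name, size)
```

    If Drive is mounted under "/content/drive", the second call also walks your Drive files, which can be slow; unmount it first or pass a narrower directory.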

    Your code runs in a virtual machine dedicated to your account. Each virtual machine has a maximum lifetime set by the system, and virtual machines that stay idle for a while are recycled.

    An expired virtual machine is deleted, so none of its files are left the next time you start up.
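
    Since files on the virtual machine do not survive, anything you want to keep has to be copied somewhere persistent, such as the mounted Drive, before the session ends. A minimal sketch using only the standard library follows; the function name and the "colab_backup" folder are assumptions for illustration.

```python
import shutil
from pathlib import Path


def backup_csvs(src_dir, dst_dir):
    """Copy every .csv file directly under src_dir into dst_dir and return the new paths."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)  # create the backup folder if needed
    copied = []
    for csv_path in sorted(Path(src_dir).glob("*.csv")):
        target = dst / csv_path.name
        shutil.copy2(csv_path, target)  # overwrites an older copy with the same name
        copied.append(target)
    return copied


# Example call in Colab after mounting Drive (hypothetical backup folder):
# backup_csvs("/content", "/content/drive/My Drive/colab_backup")
```

    The copy only looks at files directly under the source directory, so CSV files in subfolders would need a separate call.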