Working with the file system
Programs often need to interact with the file system, and the Python standard library provides many tools that make this much easier.
File and directory paths
A path is a string of characters that indicates the location of a file or directory in the file system (source: Wikipedia). Programs need paths, for example, to open and save files. In most cases Python represents a path as an ordinary string object.
A path is usually a sequence of nested directories separated by a special character, and that separator depends on the operating system: Windows uses "\", Unix-like systems use "/". Paths are also either absolute or relative. An absolute path always starts at the root of the file system (on Windows, a logical drive such as "C:"; on Unix-like systems, "/") and always refers to the same file or directory. A relative path does not start at the root; it gives a location relative to the current working directory, so it refers to a completely different file if the working directory changes.
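The distinction can be checked without touching the disk. The sketch below uses the pure path classes from pathlib; all file names in it are hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Unix-like systems separate directories with "/" and root the tree at "/".
posix_abs = PurePosixPath("/home/user/report.txt")
posix_rel = PurePosixPath("docs/report.txt")

# Windows separates directories with "\" and roots each tree at a drive.
win_abs = PureWindowsPath(r"C:\Users\user\report.txt")
win_rel = PureWindowsPath(r"docs\report.txt")

for p in (posix_abs, posix_rel, win_abs, win_rel):
    print(f"{str(p):30} absolute={p.is_absolute()}")
```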
pathlib: object-oriented filesystem paths
This module offers classes representing filesystem paths with semantics appropriate for different operating systems. Path classes are divided between pure paths, which provide purely computational operations without I/O, and concrete paths, which inherit from pure paths but also provide I/O operations.

If you’ve never used this module before or just aren’t sure which class is right for your task, Path is most likely what you need. It instantiates a concrete path for the platform the code is running on.
Pure paths are useful in some special cases; for example:
- If you want to manipulate Windows paths on a Unix machine (or vice versa). You cannot instantiate a WindowsPath when running on Unix, but you can instantiate PureWindowsPath.
- If you want to make sure that your code only manipulates paths without actually accessing the OS. In this case, instantiating one of the pure classes may be useful since those simply don’t have any OS-accessing operations.
PEP 428: The pathlib module – object-oriented filesystem paths.
For low-level path manipulation on strings, you can also use the os.path module.
Basic use
Importing the main class:
Listing Python source files in this directory tree:
Navigating inside a directory tree:
Querying path properties:
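The four steps listed above can be sketched in one runnable snippet. It builds a small throwaway tree in a temporary directory first; the directory and file names are assumptions made for illustration.

```python
import tempfile
from pathlib import Path  # importing the main class

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical tree: setup.py at the top, pkg/mod.py below it.
    (root / "pkg").mkdir()
    (root / "setup.py").write_text("print('setup')\n")
    (root / "pkg" / "mod.py").write_text("print('mod')\n")

    # Listing Python source files in this directory tree.
    sources = sorted(p.relative_to(root).as_posix() for p in root.glob("**/*.py"))

    # Navigating inside a directory tree with the slash operator.
    target = root / "pkg" / "mod.py"

    # Querying path properties.
    target_exists = target.exists()
    target_is_dir = target.is_dir()

print(sources, target_exists, target_is_dir)
```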
Pure paths
Pure path objects provide path-handling operations which don’t actually access a filesystem. There are three ways to access these classes, which we also call flavours:
class pathlib.PurePath(*pathsegments)
A generic class that represents the system’s path flavour (instantiating it creates either a PurePosixPath or a PureWindowsPath ):
Each element of pathsegments can be either a string representing a path segment, an object implementing the os.PathLike interface which returns a string, or another path object:
When pathsegments is empty, the current directory is assumed:
If a segment is an absolute path, all previous segments are ignored (like os.path.join() ):
On Windows, the drive is not reset when a rooted relative path segment (e.g., r’\foo’ ) is encountered:
Spurious slashes and single dots are collapsed, but double dots ( ‘..’ ) and leading double slashes ( ‘//’ ) are not, since this would change the meaning of a path for various reasons (e.g. symbolic links, UNC paths):
(a naïve approach would make PurePosixPath(‘foo/../bar’) equivalent to PurePosixPath(‘bar’) , which is wrong if foo is a symbolic link to another directory)
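The segment-handling rules above can be tried directly. This is a small sketch using the explicit flavours so that the results are the same on every platform.

```python
from pathlib import PurePosixPath, PureWindowsPath

# No segments: the current directory is assumed.
empty = PurePosixPath()

# An absolute segment discards everything before it.
reset = PurePosixPath("/etc", "/usr", "lib64")

# On Windows a rooted segment replaces the path but keeps the drive.
drive_kept = PureWindowsPath("c:/Windows", "/Program Files")

# Spurious slashes and single dots collapse; ".." and a leading "//" do not.
collapsed = [
    str(PurePosixPath(raw))
    for raw in ("foo//bar", "foo/./bar", "foo/../bar", "//foo/bar")
]

print(empty, reset, drive_kept, collapsed)
```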
Pure path objects implement the os.PathLike interface, allowing them to be used anywhere the interface is accepted.
Changed in version 3.6: Added support for the os.PathLike interface.
class pathlib.PurePosixPath(*pathsegments)
A subclass of PurePath, this path flavour represents non-Windows filesystem paths:
pathsegments is specified similarly to PurePath .
class pathlib.PureWindowsPath(*pathsegments)
A subclass of PurePath , this path flavour represents Windows filesystem paths, including UNC paths:
pathsegments is specified similarly to PurePath .
Regardless of the system you’re running on, you can instantiate all of these classes, since they don’t provide any operation that does system calls.
General properties
Paths are immutable and hashable . Paths of a same flavour are comparable and orderable. These properties respect the flavour’s case-folding semantics:
Paths of a different flavour compare unequal and cannot be ordered:
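A short sketch of these comparison rules, using both pure flavours side by side:

```python
from pathlib import PurePosixPath, PureWindowsPath

posix_case = PurePosixPath("foo") == PurePosixPath("FOO")        # POSIX is case-sensitive
windows_case = PureWindowsPath("foo") == PureWindowsPath("FOO")  # Windows folds case
in_set = PureWindowsPath("FOO") in {PureWindowsPath("foo")}      # hashing follows equality
ordered = PureWindowsPath("C:") < PureWindowsPath("d:")

# Different flavours never compare equal and cannot be ordered.
cross_equal = PureWindowsPath("foo") == PurePosixPath("foo")
try:
    PureWindowsPath("foo") < PurePosixPath("foo")
    cross_ordering_error = None
except TypeError as exc:
    cross_ordering_error = type(exc).__name__

print(posix_case, windows_case, in_set, ordered, cross_equal, cross_ordering_error)
```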
Operators
The slash operator helps create child paths, like os.path.join() . If the argument is an absolute path, the previous path is ignored. On Windows, the drive is not reset when the argument is a rooted relative path (e.g., r’\foo’ ):
A path object can be used anywhere an object implementing os.PathLike is accepted:
The string representation of a path is the raw filesystem path itself (in native form, e.g. with backslashes under Windows), which you can pass to any function taking a file path as a string:
Similarly, calling bytes on a path gives the raw filesystem path as a bytes object, as encoded by os.fsencode() :
Calling bytes is only recommended under Unix. Under Windows, the unicode form is the canonical representation of filesystem paths.
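The operator and conversion behaviour described in this section can be sketched as follows. The paths are illustrative only, and nothing here touches the filesystem.

```python
import os
from pathlib import PurePosixPath, PureWindowsPath

p = PurePosixPath("/etc")
child = p / "init.d" / "apache2"          # slash operator builds child paths
reversed_join = "/usr" / PurePosixPath("bin")  # a plain string may be on the left
absolute_wins = p / "/an_absolute_path"   # absolute argument replaces the path
windows_rooted = PureWindowsPath("c:/Windows") / "/Program Files"  # drive is kept

as_fspath = os.fspath(p)                  # os.PathLike interface
as_text = str(PureWindowsPath("c:/Program Files"))  # native form, with backslashes
as_bytes = bytes(p)                       # encoded like os.fsencode()

print(child, reversed_join, absolute_wins, windows_rooted, as_fspath, as_text, as_bytes)
```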
Accessing individual parts
To access the individual “parts” (components) of a path, use the following property:
PurePath.parts
A tuple giving access to the path’s various components:
(note how the drive and local root are regrouped in a single part)
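For instance, with two illustrative paths:

```python
from pathlib import PurePosixPath, PureWindowsPath

posix_parts = PurePosixPath("/usr/bin/python3").parts
# The drive and root form a single first component on Windows.
windows_parts = PureWindowsPath("c:/Program Files/PSF").parts

print(posix_parts)
print(windows_parts)
```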
Methods and properties
Pure paths provide the following methods and properties:
PurePath.drive
A string representing the drive letter or name, if any:
UNC shares are also considered drives:
PurePath.root
A string representing the (local or global) root, if any:
UNC shares always have a root:
If the path starts with more than two successive slashes, PurePosixPath collapses them:
This behavior conforms to The Open Group Base Specifications Issue 6, paragraph 4.11 Pathname Resolution:
“A pathname that begins with two successive slashes may be interpreted in an implementation-defined manner, although more than two leading slashes shall be treated as a single slash.”
PurePath.anchor
The concatenation of the drive and root:
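A compact sketch of the drive, root and anchor properties; the host and share names are made up for the example.

```python
from pathlib import PurePosixPath, PureWindowsPath

local = PureWindowsPath("c:/Program Files/")
unc = PureWindowsPath("//host/share/foo.txt")  # hypothetical UNC share

local_triplet = (local.drive, local.root, local.anchor)
unc_triplet = (unc.drive, unc.root, unc.anchor)

# POSIX paths have no drive; two leading slashes are kept, three or more collapse.
posix_drive = PurePosixPath("/etc").drive
posix_roots = [PurePosixPath(raw).root for raw in ("/etc", "//etc", "///etc")]

print(local_triplet, unc_triplet, repr(posix_drive), posix_roots)
```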
PurePath.parents
An immutable sequence providing access to the logical ancestors of the path:
Changed in version 3.10: The parents sequence now supports slices and negative index values.
PurePath.parent
The logical parent of the path:
You cannot go past an anchor, or empty path:
This is a purely lexical operation, hence the following behaviour:
If you want to walk an arbitrary filesystem path upwards, it is recommended to first call Path.resolve() so as to resolve symlinks and eliminate ".." components.
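The following sketch illustrates parents, parent, and the purely lexical handling of ".." on a few example paths:

```python
from pathlib import PurePosixPath, PureWindowsPath

p = PureWindowsPath("c:/foo/bar/setup.py")
ancestors = [str(a) for a in p.parents]
direct_parent = p.parent

# You cannot go past an anchor or an empty path.
root_parent = PurePosixPath("/").parent
dot_parent = PurePosixPath(".").parent

# Lexical only: the parent of "foo/.." is "foo", not the directory above it.
lexical = PurePosixPath("foo/..").parent

print(ancestors, direct_parent, root_parent, dot_parent, lexical)
```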
PurePath.name
A string representing the final path component, excluding the drive and root, if any:
UNC drive names are not considered:
PurePath.suffix
The file extension of the final component, if any:
PurePath.suffixes
A list of the path’s file extensions:
PurePath.stem
The final path component, without its suffix:
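These four properties are easiest to compare on one path with a double extension (the file name is illustrative):

```python
from pathlib import PurePosixPath, PureWindowsPath

archive = PurePosixPath("my/library.tar.gz")
name = archive.name
suffix = archive.suffix        # only the last extension
suffixes = archive.suffixes    # every extension, in order
stem = archive.stem            # name without the last extension

# A bare UNC share has a drive but no name.
share_name = PureWindowsPath("//some/share").name

print(name, suffix, suffixes, stem, repr(share_name))
```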
PurePath.as_posix()
Return a string representation of the path with forward slashes (/):
PurePath.as_uri()
Represent the path as a file URI. ValueError is raised if the path isn’t absolute.
PurePath.is_absolute()
Return whether the path is absolute or not. A path is considered absolute if it has both a root and (if the flavour allows) a drive:
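A brief sketch of the forward-slash conversion and the absoluteness rule. Note how a Windows path needs both a drive and a root to count as absolute.

```python
from pathlib import PurePosixPath, PureWindowsPath

forward = PureWindowsPath("c:\\windows").as_posix()

absolute_checks = {
    "posix /a/b": PurePosixPath("/a/b").is_absolute(),
    "posix a/b": PurePosixPath("a/b").is_absolute(),
    "windows c:/a/b": PureWindowsPath("c:/a/b").is_absolute(),
    "windows /a/b": PureWindowsPath("/a/b").is_absolute(),   # root but no drive
    "windows c:": PureWindowsPath("c:").is_absolute(),       # drive but no root
    "windows //some/share": PureWindowsPath("//some/share").is_absolute(),
}

print(forward, absolute_checks)
```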
PurePath.is_relative_to(*other)
Return whether or not this path is relative to the other path.
New in version 3.9.
PurePath.is_reserved()
With PureWindowsPath, return True if the path is considered reserved under Windows, False otherwise. With PurePosixPath, False is always returned.
File system calls on reserved paths can fail mysteriously or have unintended effects.
PurePath.joinpath(*other)
Calling this method is equivalent to combining the path with each of the other arguments in turn:
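A quick sketch showing that joinpath() and the slash operator give the same result:

```python
from pathlib import PurePosixPath, PureWindowsPath

etc = PurePosixPath("/etc")
single = etc.joinpath("passwd")
several = etc.joinpath("init.d", "apache2")
same_as_slash = several == etc / "init.d" / "apache2"

# A rooted argument keeps the Windows drive, as with the slash operator.
windows = PureWindowsPath("c:").joinpath("/Program Files")

print(single, several, same_as_slash, windows)
```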
PurePath.match(pattern)
Match this path against the provided glob-style pattern. Return True if matching is successful, False otherwise.
If pattern is relative, the path can be either relative or absolute, and matching is done from the right:
If pattern is absolute, the path must be absolute, and the whole path must match:
As with other methods, case-sensitivity follows platform defaults:
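The matching rules can be sketched with explicit flavours, so the case-sensitivity results do not depend on the machine running the code:

```python
from pathlib import PurePosixPath, PureWindowsPath

# Relative pattern: matched from the right, against relative or absolute paths.
relative_hits = [
    PurePosixPath("a/b.py").match("*.py"),
    PurePosixPath("/a/b/c.py").match("b/*.py"),
    PurePosixPath("/a/b/c.py").match("a/*.py"),
]

# Absolute pattern: the path must be absolute and match as a whole.
absolute_hits = [
    PurePosixPath("/a.py").match("/*.py"),
    PurePosixPath("a/b.py").match("/*.py"),
]

# Case-sensitivity follows the flavour.
case_hits = [
    PurePosixPath("b.py").match("*.PY"),
    PureWindowsPath("b.py").match("*.PY"),
]

print(relative_hits, absolute_hits, case_hits)
```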
PurePath.relative_to(*other)
Compute a version of this path relative to the path represented by other. If it’s impossible, ValueError is raised:
NOTE: This function is part of PurePath and works with strings. It does not check or access the underlying file structure.
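A minimal sketch of relative_to(), including the failure case; is_relative_to() (Python 3.9+) is shown as a way to test first.

```python
from pathlib import PurePosixPath

p = PurePosixPath("/etc/passwd")
from_root = p.relative_to("/")
from_etc = p.relative_to("/etc")

# /etc/passwd does not live under /usr, so no relative form exists.
can_relate = p.is_relative_to("/usr")
try:
    p.relative_to("/usr")
    relative_error = None
except ValueError as exc:
    relative_error = type(exc).__name__

print(from_root, from_etc, can_relate, relative_error)
```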
PurePath.with_name(name)
Return a new path with the name changed. If the original path doesn’t have a name, ValueError is raised:
PurePath.with_stem(stem)
Return a new path with the stem changed. If the original path doesn’t have a name, ValueError is raised:
New in version 3.9.
PurePath.with_suffix(suffix)
Return a new path with the suffix changed. If the original path doesn’t have a suffix, the new suffix is appended instead. If the suffix is an empty string, the original suffix is removed:
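The three with_*() methods can be compared on one example path (with_stem() needs Python 3.9 or later; the file names are illustrative):

```python
from pathlib import PureWindowsPath

p = PureWindowsPath("c:/Downloads/pathlib.tar.gz")
renamed = p.with_name("setup.py")
restemmed = p.with_stem("lib")         # keeps only the last suffix, ".gz"
resuffixed = p.with_suffix(".bz2")

appended = PureWindowsPath("README").with_suffix(".txt")   # no suffix: append
stripped = PureWindowsPath("README.txt").with_suffix("")   # empty suffix: remove

# A bare anchor has no name to replace.
try:
    PureWindowsPath("c:/").with_name("setup.py")
    name_error = None
except ValueError as exc:
    name_error = type(exc).__name__

print(renamed, restemmed, resuffixed, appended, stripped, name_error)
```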
Concrete paths
Concrete paths are subclasses of the pure path classes. In addition to operations provided by the latter, they also provide methods to do system calls on path objects. There are three ways to instantiate concrete paths:
class pathlib.Path(*pathsegments)
A subclass of PurePath , this class represents concrete paths of the system’s path flavour (instantiating it creates either a PosixPath or a WindowsPath ):
pathsegments is specified similarly to PurePath .
class pathlib.PosixPath(*pathsegments)
A subclass of Path and PurePosixPath , this class represents concrete non-Windows filesystem paths:
pathsegments is specified similarly to PurePath .
class pathlib.WindowsPath(*pathsegments)
A subclass of Path and PureWindowsPath , this class represents concrete Windows filesystem paths:
pathsegments is specified similarly to PurePath .
You can only instantiate the class flavour that corresponds to your system (allowing system calls on non-compatible path flavours could lead to bugs or failures in your application):
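The restriction can be observed on any platform with a sketch like the one below. It picks whichever concrete class does not match the running system and catches NotImplementedError, which is also the base class of the more specific exception raised by recent Python versions.

```python
import os
from pathlib import Path, PosixPath, WindowsPath

# Path() picks the concrete class that matches the running system.
native = Path("setup.py")

# The other flavour cannot be instantiated here.
foreign_cls = PosixPath if os.name == "nt" else WindowsPath
try:
    foreign_cls("setup.py")
    foreign_error = None
except NotImplementedError as exc:
    foreign_error = exc

print(type(native).__name__, type(foreign_error).__name__)
```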
Methods
Concrete paths provide the following methods in addition to pure paths methods. Many of these methods can raise an OSError if a system call fails (for example because the path doesn’t exist).
Changed in version 3.8: exists() , is_dir() , is_file() , is_mount() , is_symlink() , is_block_device() , is_char_device() , is_fifo() , is_socket() now return False instead of raising an exception for paths that contain characters unrepresentable at the OS level.
classmethod Path.cwd()
Return a new path object representing the current directory (as returned by os.getcwd()):
classmethod Path.home()
Return a new path object representing the user’s home directory (as returned by os.path.expanduser() with the ~ construct). If the home directory can’t be resolved, RuntimeError is raised.
New in version 3.5.
Path.stat(*, follow_symlinks=True)
Return an os.stat_result object containing information about this path, like os.stat(). The result is looked up at each call to this method.
This method normally follows symlinks; to stat a symlink add the argument follow_symlinks=False, or use lstat().
Changed in version 3.10: The follow_symlinks parameter was added.
Path.chmod(mode, *, follow_symlinks=True)
Change the file mode and permissions, like os.chmod().
This method normally follows symlinks. Some Unix flavours support changing permissions on the symlink itself; on these platforms you may add the argument follow_symlinks=False, or use lchmod().
Changed in version 3.10: The follow_symlinks parameter was added.
Path.exists()
Whether the path points to an existing file or directory:
If the path points to a symlink, exists() returns whether the symlink points to an existing file or directory.
Path.expanduser()
Return a new path with expanded ~ and ~user constructs, as returned by os.path.expanduser(). If a home directory can’t be resolved, RuntimeError is raised.
New in version 3.5.
Path.glob(pattern)
Glob the given relative pattern in the directory represented by this path, yielding all matching files (of any kind):
Patterns are the same as for fnmatch , with the addition of “ ** ” which means “this directory and all subdirectories, recursively”. In other words, it enables recursive globbing:
Using the “ ** ” pattern in large directory trees may consume an inordinate amount of time.
Raises an auditing event pathlib.Path.glob with arguments self , pattern .
Changed in version 3.11: Return only directories if pattern ends with a pathname components separator ( sep or altsep ).
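A small sketch of plain and recursive globbing on a throwaway tree; the file names are assumptions made for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs").mkdir()
    (root / "setup.py").touch()
    (root / "docs" / "conf.py").touch()
    (root / "docs" / "index.rst").touch()

    # "*" stays within one directory level.
    top_level = sorted(p.name for p in root.glob("*.py"))
    one_level_down = sorted(p.name for p in root.glob("*/*.py"))

    # "**" means this directory and all subdirectories, recursively.
    recursive = sorted(p.relative_to(root).as_posix() for p in root.glob("**/*.py"))

print(top_level, one_level_down, recursive)
```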
Path.group()
Return the name of the group owning the file. KeyError is raised if the file’s gid isn’t found in the system database.
Path.is_dir()
Return True if the path points to a directory (or a symbolic link pointing to a directory), False if it points to another kind of file.
False is also returned if the path doesn’t exist or is a broken symlink; other errors (such as permission errors) are propagated.
Path.is_file()
Return True if the path points to a regular file (or a symbolic link pointing to a regular file), False if it points to another kind of file.
False is also returned if the path doesn’t exist or is a broken symlink; other errors (such as permission errors) are propagated.
Path.is_mount()
Return True if the path is a mount point: a point in a file system where a different file system has been mounted. On POSIX, the function checks whether path’s parent, path/.., is on a different device than path, or whether path/.. and path point to the same i-node on the same device; this should detect mount points for all Unix and POSIX variants. Not implemented on Windows.
New in version 3.7.
Path.is_symlink()
Return True if the path points to a symbolic link, False otherwise.
False is also returned if the path doesn’t exist; other errors (such as permission errors) are propagated.
Path.is_socket()
Return True if the path points to a Unix socket (or a symbolic link pointing to a Unix socket), False if it points to another kind of file.
False is also returned if the path doesn’t exist or is a broken symlink; other errors (such as permission errors) are propagated.
Path.is_fifo()
Return True if the path points to a FIFO (or a symbolic link pointing to a FIFO), False if it points to another kind of file.
False is also returned if the path doesn’t exist or is a broken symlink; other errors (such as permission errors) are propagated.
Path.is_block_device()
Return True if the path points to a block device (or a symbolic link pointing to a block device), False if it points to another kind of file.
False is also returned if the path doesn’t exist or is a broken symlink; other errors (such as permission errors) are propagated.
Path.is_char_device()
Return True if the path points to a character device (or a symbolic link pointing to a character device), False if it points to another kind of file.
False is also returned if the path doesn’t exist or is a broken symlink; other errors (such as permission errors) are propagated.
Path.iterdir()
When the path points to a directory, yield path objects of the directory contents:
The children are yielded in arbitrary order, and the special entries ‘.’ and ‘..’ are not included. If a file is removed from or added to the directory after creating the iterator, whether a path object for that file is included is unspecified.
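Iterating over a directory and classifying its entries might look like the sketch below. It runs in a temporary directory with hypothetical entry names, and it also shows that the type checks return False for a missing path instead of raising.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "data").mkdir()
    (root / "notes.txt").write_text("hello\n")

    # iterdir() yields children in arbitrary order, so collect them by name.
    kinds = {
        child.name: "dir" if child.is_dir() else "file" if child.is_file() else "other"
        for child in root.iterdir()
    }

    # A path that does not exist is neither a file nor a directory.
    missing = root / "missing"
    missing_flags = (missing.exists(), missing.is_dir(), missing.is_file())

print(kinds, missing_flags)
```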
Path.lchmod(mode)
Like Path.chmod() but, if the path points to a symbolic link, the symbolic link’s mode is changed rather than its target’s.
Path.lstat()
Like Path.stat() but, if the path points to a symbolic link, return the symbolic link’s information rather than its target’s.
Path.mkdir(mode=0o777, parents=False, exist_ok=False)
Create a new directory at this given path. If mode is given, it is combined with the process’ umask value to determine the file mode and access flags. If the path already exists, FileExistsError is raised.
If parents is true, any missing parents of this path are created as needed; they are created with the default permissions without taking mode into account (mimicking the POSIX mkdir -p command).
If parents is false (the default), a missing parent raises FileNotFoundError .
If exist_ok is false (the default), FileExistsError is raised if the target directory already exists.
If exist_ok is true, FileExistsError exceptions will be ignored (same behavior as the POSIX mkdir -p command), but only if the last path component is not an existing non-directory file.
Changed in version 3.5: The exist_ok parameter was added.
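The interaction between parents and exist_ok can be sketched as follows, inside a temporary directory with made-up directory names:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "a" / "b" / "c"

    # parents=False (default): the missing "a" and "b" cause an error.
    try:
        nested.mkdir()
        missing_parent_error = None
    except FileNotFoundError as exc:
        missing_parent_error = type(exc).__name__

    nested.mkdir(parents=True)      # creates a, a/b and a/b/c
    created = nested.is_dir()

    # exist_ok=False (default): creating it again fails.
    try:
        nested.mkdir()
        exists_error = None
    except FileExistsError as exc:
        exists_error = type(exc).__name__

    nested.mkdir(exist_ok=True)     # silently accepted

print(missing_parent_error, created, exists_error)
```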
Path.open(mode='r', buffering=-1, encoding=None, errors=None, newline=None)
Open the file pointed to by the path, like the built-in open() function does:
Path.owner()
Return the name of the user owning the file. KeyError is raised if the file’s uid isn’t found in the system database.
Path.read_bytes()
Return the binary contents of the pointed-to file as a bytes object:
New in version 3.5.
Path.read_text(encoding=None, errors=None)
Return the decoded contents of the pointed-to file as a string:
The file is opened and then closed. The optional parameters have the same meaning as in open() .
New in version 3.5.
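A short sketch of the three ways to read a file described above. The file is created first in a temporary directory, and its name and contents are illustrative.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    p = Path(tmp) / "my_text_file"
    p.write_text("Text file contents\nsecond line\n", encoding="utf-8")

    # open() behaves like the built-in and works as a context manager.
    with p.open(encoding="utf-8") as f:
        first_line = f.readline()

    # read_text() and read_bytes() open, read and close in one call.
    text = p.read_text(encoding="utf-8")
    raw = p.read_bytes()

print(repr(first_line), repr(text), raw[:18])
```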
Path.readlink()
Return the path to which the symbolic link points (as returned by os.readlink()):
New in version 3.9.
Path.rename(target)
Rename this file or directory to the given target, and return a new Path instance pointing to target. On Unix, if target exists and is a file, it will be replaced silently if the user has permission. On Windows, if target exists, FileExistsError will be raised. target can be either a string or another path object:
The target path may be absolute or relative. Relative paths are interpreted relative to the current working directory, not the directory of the Path object.
It is implemented in terms of os.rename() and gives the same guarantees.
Changed in version 3.8: Added return value, return the new Path instance.
Path.replace(target)
Rename this file or directory to the given target, and return a new Path instance pointing to target. If target points to an existing file or empty directory, it will be unconditionally replaced.
The target path may be absolute or relative. Relative paths are interpreted relative to the current working directory, not the directory of the Path object.
Changed in version 3.8: Added return value, return the new Path instance.
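The difference between rename() and replace() can be sketched like this. The targets are given as absolute paths because relative targets are resolved against the current working directory; the file names are hypothetical, and the return values require Python 3.8 or later.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    src = root / "foo"
    src.write_text("some text")

    # rename() to a target that does not exist yet.
    renamed = src.rename(root / "bar")
    old_gone = not src.exists()

    # replace() overwrites an existing file on every platform.
    existing = root / "baz"
    existing.write_text("old contents")
    replaced = renamed.replace(existing)
    final_text = replaced.read_text()

    renamed_name, replaced_name = renamed.name, replaced.name

print(renamed_name, old_gone, replaced_name, final_text)
```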
Path.absolute()
Make the path absolute, without normalization or resolving symlinks. Returns a new path object:
Path.resolve(strict=False)
Make the path absolute, resolving any symlinks. A new path object is returned:
“ .. ” components are also eliminated (this is the only method to do so):
If the path doesn’t exist and strict is True , FileNotFoundError is raised. If strict is False , the path is resolved as far as possible and any remainder is appended without checking whether it exists. If an infinite loop is encountered along the resolution path, RuntimeError is raised.
New in version 3.6: The strict argument (pre-3.6 behavior is strict).
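A sketch of resolve() eliminating ".." and of the strict flag, using a temporary directory. The "docs" and "setup.py" names are assumptions, and setup.py is deliberately never created.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()
    (base / "docs").mkdir()

    # The ".." component survives path construction...
    dotted = base / "docs" / ".." / "setup.py"
    dots_before = ".." in dotted.parts

    # ...and is removed by resolve(); strict=False tolerates the missing file.
    resolved = dotted.resolve()
    same_target = resolved == base / "setup.py"

    # strict=True insists that the path exists.
    try:
        dotted.resolve(strict=True)
        strict_error = None
    except FileNotFoundError as exc:
        strict_error = type(exc).__name__

print(dots_before, same_target, strict_error)
```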
Path.rglob(pattern)
This is like calling Path.glob() with “**/” added in front of the given relative pattern:
Raises an auditing event pathlib.Path.rglob with arguments self , pattern .
Changed in version 3.11: Return only directories if pattern ends with a pathname components separator ( sep or altsep ).
Path.rmdir()
Remove this directory. The directory must be empty.
Path.samefile(other_path)
Return whether this path points to the same file as other_path, which can be either a Path object, or a string. The semantics are similar to os.path.samefile() and os.path.samestat() .
An OSError can be raised if either file cannot be accessed for some reason.
New in version 3.5.
Path.symlink_to(target, target_is_directory=False)
Make this path a symbolic link to target. Under Windows, target_is_directory must be true (default False) if the link’s target is a directory. Under POSIX, target_is_directory’s value is ignored.
The order of arguments (link, target) is the reverse of os.symlink()’s.
Path.hardlink_to(target)
Make this path a hard link to the same file as target.
The order of arguments (link, target) is the reverse of os.link()’s.
New in version 3.10.
Path.link_to(target)
Make target a hard link to this path.
This function does not make this path a hard link to target, despite the implication of the function and argument names. The argument order (target, link) is the reverse of Path.symlink_to() and Path.hardlink_to() , but matches that of os.link() .
New in version 3.8.
Deprecated since version 3.10: This method is deprecated in favor of Path.hardlink_to() , as the argument order of Path.link_to() does not match that of Path.symlink_to() .
Path.touch(mode=0o666, exist_ok=True)
Create a file at this given path. If mode is given, it is combined with the process’ umask value to determine the file mode and access flags. If the file already exists, the function succeeds if exist_ok is true (and its modification time is updated to the current time), otherwise FileExistsError is raised.
Path.unlink(missing_ok=False)
Remove this file or symbolic link. If the path points to a directory, use Path.rmdir() instead.
If missing_ok is false (the default), FileNotFoundError is raised if the path does not exist.
If missing_ok is true, FileNotFoundError exceptions will be ignored (same behavior as the POSIX rm -f command).
Changed in version 3.8: The missing_ok parameter was added.
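Creating and removing a file, including both error cases, can be sketched as follows (the marker file name is hypothetical; missing_ok requires Python 3.8 or later):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / "marker"
    marker.touch()                    # creates an empty file
    marker.touch()                    # exist_ok=True: only updates the timestamp
    try:
        marker.touch(exist_ok=False)
        touch_error = None
    except FileExistsError as exc:
        touch_error = type(exc).__name__

    marker.unlink()
    removed = not marker.exists()
    try:
        marker.unlink()               # missing_ok=False: already gone
        unlink_error = None
    except FileNotFoundError as exc:
        unlink_error = type(exc).__name__
    marker.unlink(missing_ok=True)    # silently accepted

print(touch_error, removed, unlink_error)
```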
Path.write_bytes(data)
Open the file pointed to in bytes mode, write data to it, and close the file:
An existing file of the same name is overwritten.
New in version 3.5.
Path.write_text(data, encoding=None, errors=None, newline=None)
Open the file pointed to in text mode, write data to it, and close the file:
An existing file of the same name is overwritten. The optional parameters have the same meaning as in open() .
New in version 3.5.
Changed in version 3.10: The newline parameter was added.
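Both write methods return the number of bytes or characters written and overwrite any existing file, as this small sketch with an illustrative file name shows:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    p = Path(tmp) / "my_file"

    n_bytes = p.write_bytes(b"Binary file contents")
    binary = p.read_bytes()

    # A second write replaces the previous contents entirely.
    p.write_text("first", encoding="utf-8")
    n_chars = p.write_text("second", encoding="utf-8")
    text = p.read_text(encoding="utf-8")

print(n_bytes, binary, n_chars, text)
```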
Correspondence to tools in the os module
Below is a table mapping various os functions to their corresponding PurePath / Path equivalent.
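As an illustration of that correspondence, the sketch below pairs a handful of well-known os.path functions with their pathlib counterparts on one hypothetical path. It covers only a few pairs and is not the full mapping.

```python
import os.path
from pathlib import PurePath

raw = os.path.join("project", "src", "main.py")
p = PurePath("project") / "src" / "main.py"

# Each entry holds (os.path result, pathlib result) for the same question.
pairs = {
    "os.path.join / slash operator": (raw, str(p)),
    "os.path.basename / name": (os.path.basename(raw), p.name),
    "os.path.dirname / parent": (os.path.dirname(raw), str(p.parent)),
    "os.path.splitext / suffix": (os.path.splitext(raw)[1], p.suffix),
    "os.path.isabs / is_absolute": (os.path.isabs(raw), p.is_absolute()),
}

for label, (old, new) in pairs.items():
    print(f"{label:32} {old!r} == {new!r}")
```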
Python: specifying a file path
To specify a file path in Python, you can use either a relative or an absolute path. A relative path is resolved against the current directory, while an absolute path is resolved from the root of your file system.
Relative path
To specify a relative path, give the file name together with its location relative to the current directory. For example:
Absolute path
To specify an absolute path, give the file's full location in your file system. For example:
In both cases Python looks for the file at the given location and opens it for reading (the "r" mode passed to open()). If the file does not exist, Python raises FileNotFoundError.
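Both ways of opening a file, and the error for a missing one, can be sketched as follows. The file name file.txt is hypothetical, and the working directory is changed only temporarily so that the relative path has something to resolve against.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    absolute_path = os.path.join(tmp, "file.txt")
    with open(absolute_path, "w") as f:
        f.write("hello")

    # Absolute path: works regardless of the current directory.
    with open(absolute_path, "r") as f:
        via_absolute = f.read()

    # Relative path: resolved against the current working directory.
    previous_cwd = os.getcwd()
    os.chdir(tmp)
    try:
        with open("file.txt", "r") as f:
            via_relative = f.read()
    finally:
        os.chdir(previous_cwd)

    # A missing file raises FileNotFoundError.
    try:
        open(os.path.join(tmp, "missing.txt"), "r")
        missing_error = None
    except FileNotFoundError as exc:
        missing_error = type(exc).__name__

print(via_absolute, via_relative, missing_error)
```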
Python 3 Quick Tip: The easy way to deal with file paths on Windows, Mac and Linux
One of programming’s little annoyances is that Microsoft Windows uses a backslash character between folder names while almost every other computer uses a forward slash:
This is an accident of early 1980s computer history. The first version of MS-DOS used the forward slash character for specifying command-line options. When Microsoft added support for folders in MS-DOS 2.0, the forward slash character was already taken so they used a backslash instead. Thirty-five years later, we are still stuck with this incompatibility.
If you want your Python code to work on both Windows and Mac/Linux, you’ll need to deal with these kinds of platform-specific issues. Luckily, Python 3 has a new module called pathlib that makes working with files nearly painless.
Let’s take a quick look at the different ways of handling filename paths and see how pathlib can make your life better!
The Wrong Solution: Building File Paths by Hand
Let’s say you have a data folder that contains a file that you want to open in your Python program:
This is the wrong way to code it in Python:
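A sketch of what such hand-built code typically looks like; the folder and file names are assumptions made for illustration.

```python
# Hypothetical folder and file names; the separator is hardcoded as "/".
data_folder = "source_data/text_files/"
file_to_open = data_folder + "raw_data.txt"

# The resulting string would then be handed to open(file_to_open).
print(file_to_open)
```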
Notice that I’ve hardcoded the path using Unix-style forward slashes since I’m on a Mac. This will make Windows users angry.
Technically this code will still work on Windows because Python has a hack where it will recognize either kind of slash when you call open() on Windows. But even still, you shouldn’t depend on that. Not all Python libraries will work if you use the wrong kind of slash on the wrong operating system, especially if they interface with external programs or libraries.
And Python’s support for mixing slash types is a Windows-only hack that doesn’t work in reverse. Using backslashes in code will totally fail on a Mac:
For all these reasons and more, writing code with hardcoded path strings is the kind of thing that will make other programmers look at you with great suspicion. In general, you should try to avoid it.
The Old Solution: Python’s os.path module
Python’s os.path module has lots of tools for working around these kinds of operating system-specific file system issues.
You can use os.path.join() to build a path string using the right kind of slash for the current operating system:
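For example, with the same hypothetical folder and file names as before:

```python
import os.path

# Each component is passed separately; os.path.join inserts the
# separator that is correct for the operating system running the code.
data_folder = os.path.join("source_data", "text_files")
file_to_open = os.path.join(data_folder, "raw_data.txt")

print(file_to_open)
```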
This code will work perfectly on both Windows and Mac. The problem is that it’s a pain to use. Writing out os.path.join() and passing in each part of the path as a separate string is wordy and unintuitive.
Since most of the functions in the os.path module are similarly annoying to use, developers often “forget” to use them even when they know better. This leads to a lot of cross-platform bugs and angry users.
The Better Solution: Python 3’s pathlib!
Python 3.4 introduced a new standard library module for dealing with files and paths called pathlib, and it’s great!
To use it, you just pass a path or filename into a new Path() object using forward slashes and it handles the rest:
Notice two things here:
- You should use forward slashes with pathlib functions. The Path() object will convert forward slashes into the correct kind of slash for the current operating system. Nice!
- If you want to add on to the path, you can use the / operator directly in your code. Say goodbye to typing out os.path.join(a, b) over and over.
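The two points above can be sketched like this, again with hypothetical folder and file names:

```python
from pathlib import Path

# Forward slashes in the literal; Path stores the individual components.
data_folder = Path("source_data/text_files/")

# The / operator appends a component, like os.path.join().
file_to_open = data_folder / "raw_data.txt"

# str(file_to_open) uses the native separator of the current system.
print(file_to_open)
```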
And if that’s all pathlib did, it would be a nice addition to Python — but it does a lot more!
For example, we can read the contents of a text file without having to mess with opening and closing the file:
In fact, pathlib makes most standard file operations quick and easy:
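A sketch of reading a file in one call and of a few everyday queries. The file is created in a temporary directory first, so the names and contents are assumptions.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    data_folder = Path(tmp) / "source_data" / "text_files"
    data_folder.mkdir(parents=True)
    file_to_open = data_folder / "raw_data.txt"
    file_to_open.write_text("hello\n", encoding="utf-8")

    # Read the whole file without managing open()/close() yourself.
    contents = file_to_open.read_text(encoding="utf-8")

    # Common questions about a path.
    summary = (
        file_to_open.name,      # file name
        file_to_open.suffix,    # extension
        file_to_open.stem,      # name without extension
        file_to_open.exists(),  # is it really there?
    )
    missing = (data_folder / "not_here.txt").exists()

print(repr(contents), summary, missing)
```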
You can even use pathlib to explicitly convert a Unix path into a Windows-formatted path:
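One way to sketch the conversion is shown below. It uses PurePosixPath rather than Path for the source so that the result is the same on any machine; the file name is hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath

unix_path = PurePosixPath("source_data/text_files/raw_data.txt")

# Re-wrapping the path in the Windows flavour switches the separator.
windows_path = PureWindowsPath(unix_path)

print(windows_path)
```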
And if you REALLY want to use backslashes in your code safely, you can declare your path as Windows-formatted and pathlib can convert it to work on the current operating system:
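A sketch of the reverse direction follows. Going through as_posix() is an explicit way to do the conversion that behaves the same across Python versions; the file name is hypothetical.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

# Declare the literal as Windows-formatted so the backslashes are parsed.
windows_style = PureWindowsPath("source_data\\text_files\\raw_data.txt")

# as_posix() yields forward slashes, which every flavour understands.
portable = PurePosixPath(windows_style.as_posix())
native = Path(windows_style.as_posix())   # correct for the current system

print(portable, native)
```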
If you want to get fancy, you can even use pathlib to do things like resolve relative file paths, parse network share paths and generate file:// urls. Here’s an example that will open a local file in your web browser with just two lines of code:
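A sketch of that idea using the standard webbrowser module. The helper name open_in_browser and the file index.html are hypothetical, and the helper is only defined here, not called, so running the snippet does not launch a browser.

```python
import webbrowser
from pathlib import Path


def open_in_browser(filename: str) -> bool:
    """Open a local file in the default browser (hypothetical helper)."""
    return webbrowser.open(Path(filename).absolute().as_uri())


# as_uri() needs an absolute path, hence the absolute() call.
uri = Path("index.html").absolute().as_uri()
print(uri)
```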
This was just a tiny peek at pathlib. It’s a great replacement for lots of different file-related functionality that used to be scattered around different Python modules. Check it out!