You can download hmBlogs from the links below:

Raw Sample Cleaned Sample
Raw Corpus Cleaned Corpus

Because the full corpora are large, we recommend checking the samples first and then downloading the full versions. The main corpora are about a thousand times larger than the samples.

The files are encrypted. The password is: dont-forget-to-cite-hmblogs


So please don't forget to cite:

@article{motahari2021hmblogs,
  title={HmBlogs: A big general Persian corpus},
  author={Motahari Khansari, Hamzeh and Shamsfard, Mehrnoush},
  journal={arXiv e-prints},
  pages={arXiv--2111},
  year={2021}
}