The essence of your question seems to be about image quality. There is considerable literature on this subject, and the upshot is that image quality is difficult to quantify.
Standard mathematical error measures, such as signal-to-noise ratio (SNR) and mean squared error (MSE), give a quantitative answer, but it is well known that they do not correlate well with the subjective opinions of viewers, which should be our final authority. Other methods, even ones based on psycho-visual models of the viewer (for example, S. A. Karunasekera and N. G. Kingsbury, "A distortion measure for blocking artifacts in images based on human visual sensitivity," IEEE Trans. on Image Proc., vol. 4, no. 6, June 1995, pp. 713-724, and M. Miyahara, K. Kotani and V. R. Algazi, "Objective picture quality scale (PQS) for image coding," IEEE Trans. on Comm., vol. 46, no. 9, September 1998, pp. 1215-1226), have not proven to be much better than SNR.
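For concreteness, here is a minimal Python sketch of the two measures mentioned above. It assumes 8-bit grayscale images stored as lists of rows, and it computes PSNR, the peak variant of SNR usually quoted for images (a choice made here for illustration). The function names and the tiny sample images are made up for the example.

```python
import math

def mse(original, distorted):
    """Mean squared error between two equally sized images (lists of rows)."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(original, distorted):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count

def psnr(original, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(original, distorted)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)

# Hypothetical 3x3 image and a slightly distorted copy of it.
original = [[52, 55, 61], [62, 59, 55], [63, 65, 66]]
distorted = [[54, 55, 60], [62, 57, 55], [63, 66, 66]]

print("MSE :", mse(original, distorted))
print("PSNR:", psnr(original, distorted), "dB")
```

Both numbers are easy to compute, which is why they are so widely used, but as noted above a higher PSNR does not guarantee that a viewer will prefer the image.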
In addition, as you change the type of image (line drawing, cartoon, photograph, portrait, etc.), different types of compression distortion become more apparent. Mosquito noise may be the objectionable artifact in one image, while staircase noise may be the culprit in another.
In short, there is no answer to your question, "what can lead to better image quality?"
That said, there are some things we can say about the DCT that matter. The coefficients in a DCT block run from low variation to high variation in a zigzag pattern starting at the upper left corner [(0,0) → (0,1) → (1,0) → (2,0) → (1,1) → (0,2) → etc.], which is what your choice of a triangle mirrors. The closer a coefficient is to the upper left corner, the smoother the information it carries [in fact, the (0,0) DCT value is, up to a scale factor, the average value of the entire block], and the further you get from that corner, the more "high-frequency" detail it represents. The closer a coefficient is to the top or left edge of the block, the more purely horizontal or vertical the detail it represents, and the closer it is to the block diagonal, the more diagonal the detail.
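To make the layout of a DCT block tangible, here is a rough Python sketch. It assumes an orthonormal 2-D DCT-II (one common scaling, under which the (0,0) value equals the block size times the block average) and indexes coefficients as (row, column). The helper names `dct2` and `zigzag_order` and the two test blocks are invented for illustration.

```python
import math

def dct2(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):          # vertical frequency (row of the coefficient)
        for v in range(n):      # horizontal frequency (column of the coefficient)
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def zigzag_order(n):
    """Coefficient positions in zigzag order, starting at the upper left corner."""
    order = []
    for s in range(2 * n - 1):              # s = row + column (one anti-diagonal)
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:                      # even diagonals run bottom-left to top-right
            rows = reversed(rows)
        order.extend((i, s - i) for i in rows)
    return order

# A perfectly flat block: all of its energy lands in the (0,0) coefficient.
flat = [[100.0] * 8 for _ in range(8)]
flat_coeffs = dct2(flat)

# A block that changes only from left to right: only the top row is non-zero.
ramp = [[float(y) for y in range(8)] for _ in range(8)]
ramp_coeffs = dct2(ramp)

print(zigzag_order(8)[:6])
print(flat_coeffs[0][0])        # 8 * average with this scaling
print(max(abs(ramp_coeffs[u][v]) for u in range(1, 8) for v in range(8)))
```

The direct four-loop transform is slow but easy to check against the formula; a real codec would use a fast, separable implementation.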
In short, lossy compression usually means discarding the "parts" that the eye is least likely to perceive, which are typically the high-frequency ones. (Dropping the "smoother" DCT values causes severe distortion.) The more DCT values you throw away, the greater the compression ratio, but also the greater the distortion you cause.
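The trade-off can be seen numerically with a small Python sketch. It assumes an orthonormal DCT, keeps only the first few coefficients in zigzag order, reconstructs the block, and measures the MSE. The 8x8 test block (a gradient plus a fine checkerboard) and all helper names are hypothetical, and MSE is used only because it is simple, not because it tracks perceived quality.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis: row k holds the k-th cosine basis vector."""
    return [[(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
             * math.cos((2 * x + 1) * k * math.pi / (2 * n))
             for x in range(n)] for k in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def dct2(block):
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))       # C * B * C^T

def idct2(coeffs):
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)      # C^T * X * C

def zigzag_order(n):
    order = []
    for s in range(2 * n - 1):
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)
        order.extend((i, s - i) for i in rows)
    return order

def keep_zigzag(coeffs, keep):
    """Zero every coefficient except the first `keep` in zigzag order."""
    n = len(coeffs)
    kept = [[0.0] * n for _ in range(n)]
    for i, j in zigzag_order(n)[:keep]:
        kept[i][j] = coeffs[i][j]
    return kept

def mse(a, b):
    n = len(a)
    return sum((a[i][j] - b[i][j]) ** 2 for i in range(n) for j in range(n)) / (n * n)

# Hypothetical block: smooth gradient plus a fine checkerboard texture.
block = [[100 + 10 * x + 5 * y + (8 if (x + y) % 2 else -8) for y in range(8)]
         for x in range(8)]
coeffs = dct2(block)

errors = {keep: mse(block, idct2(keep_zigzag(coeffs, keep)))
          for keep in (64, 32, 16, 4, 1)}
for keep, err in errors.items():
    print(f"keep {keep:2d} of 64 coefficients -> MSE {err:.2f}")

# Dropping only the smoothest value, the (0,0) coefficient, is far worse
# than dropping only the highest-frequency one at (7,7).
no_dc = [row[:] for row in coeffs]
no_dc[0][0] = 0.0
no_high = [row[:] for row in coeffs]
no_high[7][7] = 0.0
drop_dc_error = mse(block, idct2(no_dc))
drop_high_error = mse(block, idct2(no_high))
print(f"drop (0,0) only -> MSE {drop_dc_error:.2f}")
print(f"drop (7,7) only -> MSE {drop_high_error:.2f}")
```

Because the transform here is orthonormal, the error is exactly the energy of the discarded coefficients, so it can only grow as more of them are thrown away.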
As for block size, it all depends. The more variation and detail a block contains, the more you lose by throwing away coefficients. Some compression algorithms adaptively use different block sizes within the same image, so that highly detailed areas get more, smaller blocks and smooth areas get fewer, larger blocks.
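As a rough illustration of adaptive block sizes, the Python sketch below splits a square region into four quadrants whenever its pixel variance exceeds a threshold. The variance criterion, the threshold of 100, the minimum block size of 4, and the 16x16 test image are all assumptions chosen for the example; real encoders use their own, more elaborate criteria.

```python
def block_variance(img, top, left, size):
    """Variance of the pixel values inside one square block."""
    values = [img[r][c]
              for r in range(top, top + size)
              for c in range(left, left + size)]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def split_blocks(img, top, left, size, min_size, threshold):
    """Return a list of (top, left, size) blocks covering the region.

    A block is kept whole if it is smooth enough or already at the
    minimum size; otherwise it is split into four quadrants.
    """
    if size <= min_size or block_variance(img, top, left, size) <= threshold:
        return [(top, left, size)]
    half = size // 2
    blocks = []
    for dt in (0, half):
        for dl in (0, half):
            blocks += split_blocks(img, top + dt, left + dl, half,
                                   min_size, threshold)
    return blocks

# Hypothetical 16x16 image: a busy checkerboard in the top-left quadrant,
# flat gray everywhere else.
img = [[(255 if (r + c) % 2 else 0) if (r < 8 and c < 8) else 128
        for c in range(16)] for r in range(16)]

blocks = split_blocks(img, 0, 0, 16, min_size=4, threshold=100.0)
for top, left, size in blocks:
    print(f"block at ({top:2d},{left:2d}) size {size}x{size}")
```

In this toy image the detailed quadrant ends up covered by small blocks while each flat quadrant stays a single larger block, which is the behavior described above.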
For algorithms that use a single block size, 8x8, 16x16, and 32x32 are common in things like JPEG and MPEG. The processing required to compress with a fixed block size is less than with adaptive block sizes, but overall quality will also be lower.