How to get all comments from Disqus?

I want to get all comments on CNN, whose comment system is Disqus. For example, http://edition.cnn.com/2013/02/25/tech/innovation/google-glass-privacy-andrew-keen/index.html?hpt=hp_c1

The comment system requires clicking "load more" to see additional comments. I tried parsing the HTML with PHP, but that cannot retrieve all the comments because they are loaded by JavaScript. So I'm wondering whether anyone has a more convenient way to get all the comments for a specific CNN URL.

Has anyone done this successfully? Thanks in advance.

+6
3 answers

The Disqus API supports pagination through cursors that are returned in each JSON response. See here for cursor information: http://disqus.com/api/docs/cursors/

Since you mentioned PHP, you could start with something like this:

    <?php
    $apikey    = '<your key here>'; // get keys at http://disqus.com/api/ (can be public or secret for this endpoint)
    $shortname = '<the disqus forum shortname>'; // defined in: var disqus_shortname = '...';
    $thread    = 'link:<URL of thread>'; // IMPORTANT: the URL that you're viewing isn't necessarily the one stored with the thread of comments
    //$thread  = 'ident:<identifier of thread>'; // use this if 'link:' has no results; defined in: var disqus_identifier = '...';
    $limit     = '100'; // max is 100 for this endpoint, 25 is the default

    // Base URL; the cursor parameter is appended per request in listcomments()
    $endpoint = 'https://disqus.com/api/3.0/threads/listPosts.json?api_key='.urlencode($apikey)
        .'&forum='.urlencode($shortname).'&thread='.urlencode($thread).'&limit='.$limit;

    $j = 0;
    listcomments($endpoint, '', $j);

    function listcomments($endpoint, $cursor, $j) {
        // Standard cURL; the first request is sent without a cursor
        $url = $endpoint.($cursor ? '&cursor='.urlencode($cursor) : '');
        $session = curl_init($url);
        curl_setopt($session, CURLOPT_RETURNTRANSFER, 1); // return the result on success instead of just true
        $data = curl_exec($session);
        curl_close($session);

        // Decode JSON data
        $results = json_decode($data);
        if ($results === NULL) die('Error parsing json');

        // Comment response
        $comments = $results->response;
        // Cursor for pagination
        $cursor = $results->cursor;

        $i = 0;
        foreach ($comments as $comment) {
            $name    = $comment->author->name;
            $message = $comment->message; // separate variable so $comment is not overwritten
            $created = $comment->createdAt;
            // Get more data...
            echo "<p>".$name." wrote:<br/>";
            echo $message."<br/>";
            echo $created."</p>";
            $i++;
        }

        // cursor through until today: a full page means there may be more
        if ($i == 100) {
            listcomments($endpoint, $cursor->next, $j);
            /* use this instead to only run a fixed number of iterations
            $j++;
            if ($j < 10) { listcomments($endpoint, $cursor->next, $j); }*/
        }
    }
    ?>
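The same cursor walk can also be written as a loop instead of recursion. The following is only a sketch in JavaScript, not part of the original answer: `collectAllPosts`, `buildListPostsUrl`, and the injected `fetchPage` callback are hypothetical names, and it assumes the response has the shape `{ response: [...], cursor: { hasNext, next } }` as described in the cursor documentation linked above.

```javascript
// Sketch: gather every page of a cursor-paginated listing.
// fetchPage is a caller-supplied (hypothetical) async function that takes a
// cursor string, or null for the first page, and resolves to the decoded
// JSON body: { response: [...], cursor: { hasNext, next } } (assumed shape).
async function collectAllPosts(fetchPage, maxPages = 1000) {
  const posts = [];
  let cursor = null;
  // maxPages is a safety cap so a misbehaving cursor cannot loop forever.
  for (let page = 0; page < maxPages; page++) {
    const body = await fetchPage(cursor);
    posts.push(...body.response);
    if (!body.cursor || !body.cursor.hasNext) break; // no further pages
    cursor = body.cursor.next;
  }
  return posts;
}

// Builds a request URL for the same endpoint as the PHP example.
// The cursor parameter is only added when a cursor is given.
function buildListPostsUrl(apiKey, forum, thread, cursor) {
  const params = new URLSearchParams({
    api_key: apiKey,
    forum: forum,
    thread: thread,
    limit: '100',
  });
  if (cursor) params.set('cursor', cursor);
  return 'https://disqus.com/api/3.0/threads/listPosts.json?' + params.toString();
}
```

In an environment with a global `fetch`, `fetchPage` could be something like `c => fetch(buildListPostsUrl(key, forum, thread, c)).then(r => r.json())`. Stopping on `hasNext` rather than on a full page of 100 avoids one extra request when the last page happens to be exactly full.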
+6

Just to add: to get the URL of the Disqus comments embedded on any page, run this JavaScript code in your web browser console:

    var visit = (function () {
        // The Disqus embed is an iframe inside div#disqus_thread
        var url = document.querySelector('div#disqus_thread iframe').src;
        // Upgrade an http:// URL to https:// (without patching String.prototype)
        if (url.indexOf('https://') !== 0) return url.slice(0, 4) + "s" + url.slice(4);
        return url;
    })();

The URL is now stored in the variable 'visit':

 console.log(visit); 

I have collected all the data for you as UTF-8 JSON, saved in a .txt file, which can be found at this link. The JSON contains several named variables, but the one you need is 'data', which is a JavaScript array.

Loop through each entry and split it on "x == x". The "x == x" separator was added so that the user ID of each commenter is captured as well. If an entry has a name instead of a numeric user ID, the account is no longer active.

To use a user ID, build a profile URL such as https://disqus.com/users/106222183 , where 106222183 is the user ID.
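The split and profile-URL steps could look roughly like the sketch below. The data file itself is not shown here, so the entry layout is an assumption: each entry is taken to be the comment text, then the separator "x == x", then the user ID or name. `parseEntry` is a hypothetical helper name.

```javascript
// Assumption: each entry looks like "<comment text>x == x<user ID or name>".
// The real layout of the 'data' array may differ.
const SEPARATOR = 'x == x';

function parseEntry(entry) {
  // Split on the last separator so that comment text containing the
  // separator by chance does not break the user field.
  const idx = entry.lastIndexOf(SEPARATOR);
  if (idx === -1) {
    return { text: entry, user: null, active: false, profileUrl: null };
  }
  const text = entry.slice(0, idx);
  const user = entry.slice(idx + SEPARATOR.length).trim();
  // A purely numeric value is treated as a user ID of an active account;
  // a name means the account is no longer active.
  const active = /^\d+$/.test(user);
  return {
    text: text,
    user: user,
    active: active,
    profileUrl: active ? 'https://disqus.com/users/' + user : null,
  };
}
```

Calling `data.map(parseEntry)` on the array would then give one object per comment, with `profileUrl` set only for entries that carry a numeric ID.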

+3

Without the API:

    #disqus_thread {
        position: relative;
        height: 300px;
        background-color: #fff;
        overflow: hidden;
    }
    #disqus_thread:after {
        content: "";
        display: block;
        height: 10px;
        width: 100%;
        position: absolute;
        bottom: 0;
        background: white;
    }
    #disqus_thread.loaded { height: auto; }
    #disqus_thread.loaded:after { height: 55px; }
    #disqus-load {
        text-align: center;
        color: #fff;
        padding: 11px 14px;
        font-size: 13px;
        font-weight: 500;
        display: block;
        border: none;
        background: rgba(29,47,58,.6);
        line-height: 1.1;
        border-radius: 3px;
        transition: background .2s;
        text-shadow: none;
        cursor: pointer;
    }

    <div class="disqus-comments">
        <div id='disqus_thread'></div>
        <div id='disqus-load'>Load comments</div>
    </div>

    <script type="text/javascript">
    $(document).ready(function() {
        var disqus_shortname = 'testare-123';
        (function() {
            var dsq = document.createElement('script');
            dsq.type = 'text/javascript';
            dsq.async = true;
            dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
            (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
        })();
        $('#disqus-load').on('click', function(){
            $.ajax({
                type: "GET",
                url: "http://" + disqus_shortname + ".disqus.com/embed.js",
                dataType: "script",
                cache: true
            });
            $(this).fadeOut();
            $('#disqus_thread').addClass('loaded');
        });
    });
    /* * * CONFIGURATION VARIABLES * * */
    // var disqus_shortname = 'testare-123';
    // /* * * DON'T EDIT BELOW THIS LINE * * */
    // (function() {
    //     var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
    //     dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
    //     (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
    // })();
    </script>
    <noscript>Please enable JavaScript to view the <a href="https://disqus.com/?ref_noscript" rel="nofollow">comments powered by Disqus.</a></noscript>
-1